Cowles Foundation.


1.   Introduction

In  the  pioneering  research  in  econometrics  done  at  the  Cowles 
Foundation,  estimation  techniques  for  simultaneous  equations  models  were 
studied  extensively.   Maximum  likelihood  estimation  methods  were  applied  to 
both  the  single  equation  case  (LIML)  and  to  the  complete  simultaneous 
equations  models  (FIML).   It  is  interesting  to  note  that  while  questions  of 
identification  were  completely  solved  for  the  case  of  coefficient 
restrictions,  the  problem  of  identification  with  covariance  restrictions 
remained. Further research by Fisher (1966), Rothenberg (1971), and Wegge
(1965) advanced our knowledge in this field, with Fisher's examination of
certain special cases especially insightful. In a companion paper, Hausman
and  Taylor  (1983),  we  give  conditions  in  terms  of  the  interaction  of 
restrictions  on  the  disturbance  covariance  matrix  and  restrictions  on  the 
coefficients  of  the  endogenous  variables  for  the  identification  problem. 
What  is  especially  interesting  about  our  solution  is  that  covariance 
restrictions,  if  they  are  to  be  effective  in  aiding  identification,  must  take 
one  of  two  forms.  First,  covariance  restrictions  can  cause  an  endogenous 
variable to be predetermined in a particular equation, e.g., a "relatively
recursive"  specification.   Or,  covariance  restrictions  can  lead  to  an 
estimated  residual  from  an  identified  equation  being  predetermined  in  another 
equation.  Both  of  these  forms  of  identification  have  ready  interpretations 
in  estimation  as  instrumental  variable  procedures,  which  links  them  to  the 
situation  where  only  coefficient  restrictions  are  present. 


For  full  information  maximum  likelihood  (FIML),  the  Cowles  Foundation 
research  considered  the  case  of  covariance  restrictions  when  the  covariance 
matrix of the residuals is specified to be diagonal (Koopmans, Rubin, and
Leipnik (1950, pp. 154-211)). The case of a diagonal covariance matrix is
also analyzed by Malinvaud (1970, pp. 678-682) and by Rothenberg (1973, pp.
77-79 and pp. 94-115), who also does FIML estimation in two small
simultaneous equations models to assess the value of the covariance
restrictions. But covariance restrictions are a largely unexplored topic in
simultaneous equations estimation, perhaps because of a reluctance to specify
a priori restrictions on the disturbance covariances.^1 However, an important
contributory  cause  of  this  situation  may  have  been  the  lack  of  a  simple, 
asymptotically  efficient,  estimation  procedure  for  the  case  of  covariance 
restrictions.   Rothenberg  and  Leenders  (1964),  in  their  proof  of  the 
efficiency  of  the  Zellner-Theil  (1962)  three  stage  least  squares  (3SLS) 
estimator,  showed  that  the  presence  of  covariance  restrictions  would  make 
FIML asymptotically more efficient than 3SLS. The Cramer-Rao asymptotic
lower bound is reduced by covariance restrictions, but 3SLS does not
adequately account for these restrictions. The reason for this finding is
that  simply  imposing  the  restrictions  on  the  covariance  matrix  is  not 
adequate  when  endogenous  variables  are  present  because  of  the  lack  of 


1. Of course, at a more fundamental level covariance restrictions are
required for any structural estimation in terms of the specification of
variables as exogenous or predetermined, cf. Fisher (1966, Ch. 4).


block-diagonality  of  the  information  matrix  between  the  slope  coefficients 
and  the  unknown  covariance  parameters.   In  fact,  imposing  the  covariance 
restrictions  on  the  conventional  3SLS  estimator  does  not  improve  its 
asymptotic efficiency. Thus efficient estimation seemed to require FIML.^2
The  role  of  covariance  restrictions  in  establishing  identification  in  the 
simultaneous  equations  model  was  not  fully  understood,  nor  did  imposing  such 
restrictions  improve  the  asymptotic  efficiency  of  the  most  popular  full 
information  estimator.   Perhaps  these  two  reasons,  more  than  the  lack  of  a 
priori disturbance covariance restrictions, may have led to their infrequent
use.

Since  our  identification  results  have  an  instrumental  variable 
interpretation it is natural to think of using instrumental variables as
an  approach  to  estimation  when  covariance  restrictions  are  present. 
Hausman  (1975)  gave  an  instrumental  variables  interpretation  of  FIML  when 
no  covariance  restrictions  were  present,  which  we  extend  to  the  case  with 
covariance  restrictions.   The  interpretation  seems  especially  attractive 
because  we  see  that  instead  of  using  the  predicted  value  of  the 
endogenous  variables  based  only  on  the  predetermined  variables  from  the 
reduced  form  as  instruments,  when  covariance  restrictions  are  present, 
FIML also uses that part of the estimated residual from the appropriate
reduced form equation which is uncorrelated with the residual in the equation
where the endogenous variables are included. Thus more information


2. Rothenberg and Leenders (1964) do propose a linearized maximum likelihood
estimator which corresponds to one Newton step beginning from a consistent
estimate. As usual, this estimator is asymptotically equivalent to FIML.
Also, an important case in which covariance restrictions have been widely
used is that of a recursive specification in which FIML coincides with
ordinary least squares (OLS).


is used in forming the instruments than in the case where covariance
restrictions  are  absent. 

A slight variation on the instrumental variables theme yields a useful
alternative to FIML. Madansky (1964) gave an instrumental variable
interpretation to 3SLS and here we augment the 3SLS estimator by additional
equations  which  the  covariance  restrictions  imply.   That  is,  a  zero 
covariance  restriction  means  that  a  pair  of  disturbances  is  uncorrelated,  and 
therefore  that  the  product  of  the  corresponding  residuals  can  itself  be  used 
in  estimation  as  the  residual  of  an  additional  equation.   These  additional 
equations  are  nonlinear  in  the  parameters  but  can  be  linearized  at  an  initial 
consistent  estimator,  and  then  3SLS  performed  on  the  augmented  equation 
system.   This  estimator,  which  we  call  augmented  three  stage  least  squares 
(A3SLS),  is  shown  to  be  more  efficient  than  the  3SLS  estimator  when  effective 
covariance  restrictions  are  present.   We  also  consider  convenient  methods  of 
using  the  extra  equations  which  are  implied  by  the  covariance  restrictions  to 
form  an  initial  consistent  estimator  when  the  covariance  restrictions  are 
necessary  for  identification. 

To  see  how  efficient  the  A3SLS  estimator  is,  we  need  to  compare  it  to 
FIML which takes account of the covariance restrictions. The instrumental
variable interpretation of FIML leads to a straightforward proof that the
A3SLS estimator is asymptotically efficient with respect to the FIML
estimator. The A3SLS estimator provides a computationally convenient
estimator which is also asymptotically efficient. Thus we are left with
attractive solutions to both identification and estimation of the traditional
simultaneous equations model.

In addition to the development of the A3SLS estimator, we also
reconsider the assignment condition for identification defined by Hausman and
Taylor (1983). We prove that the assignment condition which assigns
covariance  restrictions  to  one  of  the  two  equations  from  which  the 
restriction  arises  provides  a  necessary  condition  for  identification.   The 
rank  condition  provides  a  stronger  necessary  condition  than  the  condition  of 
Fisher  (1966).   These  necessary  conditions  apply  equation  by  equation.   We 
also  provide  a  sufficient  condition  for  local  identification  in  terms  of  the 
structural  parameters  of  the  entire  system. 


2.   Estimation  in  a  Two  Equation  Model 

We begin with a simple two equation simultaneous equation model with a
diagonal  covariance  matrix,  since  many  of  the  key  results  are  straightforward 
to  derive  in  this  context.   Consider  an  industry  supply  curve  which  in  the 
short  run  exhibits  decreasing  returns  to  scale.   Quantity  demanded  is  thus  an 
appropriate included variable in the supply equation which determines price,
$y_1$, as a function of quantity demanded, $y_2$. Also included in the
specification  of  the  supply  equation  are  the  quantities  of  fixed  factors  and 
prices  of  variable  factors,  both  of  which  are  assumed  to  be  exogenous.   The 
demand  equation  has  price  as  a  jointly  endogenous  explanatory  variable 
together with an income variable assumed to be exogenous. We assume the
covariance matrix of the residuals to be diagonal, since shocks from the
demand side of the market are assumed to be fully captured by the inclusion
of $y_2$ in the supply equation. The model specification in this simple case
is 


(2.1)     $y_1 = \beta_{12} y_2 + \gamma_{11} z_1 + \epsilon_1$


(2.2)     $y_2 = \beta_{21} y_1 + \gamma_{22} z_2 + \epsilon_2$


where we have further simplified by including only one exogenous variable in
equation (2.1). We assume that we have T observations so that each variable
in equations (2.1) and (2.2) represents a T x 1 vector. The stochastic
assumptions are $E(\epsilon_i \mid z_1, z_2) = 0$ for $i = 1, 2$, $\mathrm{var}(\epsilon_i \mid z_1, z_2) = \sigma_{ii}$, and
$\mathrm{cov}(\epsilon_1, \epsilon_2 \mid z_1, z_2) = \sigma_{12} = 0$.

Inspection  of  equations  (2.1)  and  (2.2)  shows  that  the  order  condition 
is satisfied so that each equation is identified by coefficient restrictions
alone, so long as the rank condition does not fail. If the covariance
restriction is neglected, each equation is just-identified so that 3SLS is
identical to 2SLS on each equation. Note that for each equation, 2SLS uses
the instruments $W_i = (Z\Pi_j, z_i)$, $i \neq j$, where $Z = (z_1, z_2)$ and $\Pi_j$ is the vector
of reduced form coefficients for the (other) included endogenous variable.^3
To see how FIML differs from the instrumental variables (IV) estimator, we
solve for the first order conditions of the likelihood function under the
assumption that the $\epsilon_i$'s are normally distributed. Of course, as is the case
for linear simultaneous equation estimation with only coefficient
restrictions, failure of the normality assumption does not lead to
inconsistency of the estimates. For the two equation example, the likelihood
function takes the form

(2.3)     $L = c - \frac{T}{2} \log(\sigma_{11}\sigma_{22}) + T \log|1 - \beta_{12}\beta_{21}| - \frac{1}{2}\left[\frac{1}{\sigma_{11}}(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1) + \frac{1}{\sigma_{22}}(y_2 - X_2\delta_2)'(y_2 - X_2\delta_2)\right]$

where c is a constant and the $X_i$'s and $\delta_i$'s contain the right hand side
variables and unknown coefficients respectively, e.g., $X_1 = (y_2, z_1)$ and
$\delta_1 = (\beta_{12}, \gamma_{11})'$.

To  solve  for  the  FIML  estimator,  we  find  the  first  order  conditions  for 

3. Of course, because of the condition of just identification, a numerically
identical result would be obtained if instruments $W_i = (z_1, z_2)$ were
used.

equation (2.1); results for the second equation are identical. The three
first  order  conditions  are 

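(2.4a)    $-\frac{T\beta_{21}}{1 - \beta_{12}\beta_{21}} + \frac{1}{\sigma_{11}}(y_1 - X_1\delta_1)'y_2 = 0$


(2.4b)    $\frac{1}{\sigma_{11}}(y_1 - X_1\delta_1)'z_1 = 0$


(2.4c)    $-\frac{T}{2\sigma_{11}} + \frac{1}{2\sigma_{11}^2}(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1) = 0$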

Rearranging equation (2.4c) yields the familiar solution for the variance,
$\sigma_{11} = (1/T)(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1)$. Equation (2.4b) has the usual OLS form
which is to be expected since $z_1$ is an exogenous variable. It is equation
(2.4a) where the simultaneous equations nature of the model appears with the
presence of $-T\beta_{21}/(1 - \beta_{12}\beta_{21})$ which arises from the Jacobian term in the
likelihood function (see Hausman (1975)).

Now equations (2.4a) - (2.4c) can be solved by numerical methods which
maximize the likelihood function. Koopmans et al. (1950) have a lengthy
discussion  of  various  numerical  techniques  for  maximization  which  must  be  one 
of  the  earliest  treatments  of  this  problem  in  the  econometrics  literature. 
But  we  can  solve  equation  (2.4a)  in  a  particular  way,  using  the  reduced  form 
specification  to  see  the  precise  role  of  the  covariance  restrictions  in  the 
model. We first multiply equation (2.4a) by $\sigma_{11}$ to get

(2.5)     $-\frac{T\beta_{21}\sigma_{11}}{1 - \beta_{12}\beta_{21}} + y_2'(y_1 - X_1\delta_1) = 0.$


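We now do the substitutions from the reduced form equations $y_i = Z\Pi_i + v_i$,
using the fact that $v_2 = \beta_{21}\epsilon_1/(1 - \beta_{12}\beta_{21}) + \epsilon_2/(1 - \beta_{12}\beta_{21})$, and we replace
$T\sigma_{11}$ by $(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1)$ from equation (2.4c). We transform
equation (2.5) to


(2.6)     $-\frac{\beta_{21}}{1 - \beta_{12}\beta_{21}}(y_1 - X_1\delta_1)'(y_1 - X_1\delta_1) + (Z\Pi_2 + v_2)'(y_1 - X_1\delta_1) = 0.$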


Canceling  terms,  we  find  the  key  result 


(2.7)     $\left(Z\Pi_2 + \frac{\epsilon_2}{1 - \beta_{12}\beta_{21}}\right)'\epsilon_1 = 0.$
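The cancellation can be written out step by step. The following is a sketch of the algebra, writing $\epsilon_i = y_i - X_i\delta_i$ as shorthand and substituting the reduced form disturbance $v_2$ into the transformed first order condition:

```latex
\begin{align*}
0 &= -\frac{\beta_{21}}{1-\beta_{12}\beta_{21}}\,\epsilon_1'\epsilon_1
     + (Z\Pi_2 + v_2)'\epsilon_1 \\
  &= -\frac{\beta_{21}}{1-\beta_{12}\beta_{21}}\,\epsilon_1'\epsilon_1
     + (Z\Pi_2)'\epsilon_1
     + \frac{\beta_{21}}{1-\beta_{12}\beta_{21}}\,\epsilon_1'\epsilon_1
     + \frac{1}{1-\beta_{12}\beta_{21}}\,\epsilon_2'\epsilon_1 \\
  &= \Bigl(Z\Pi_2 + \frac{\epsilon_2}{1-\beta_{12}\beta_{21}}\Bigr)'\epsilon_1 .
\end{align*}
```

The two terms in $\epsilon_1'\epsilon_1$ cancel exactly, so only the part of $v_2$ that comes from $\epsilon_2$ survives in the instrument.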


Without  the  covariance  restriction,  we  would  have  the  result 


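(2.8)     $(Z\Pi_2)'\epsilon_1 = 0, \qquad \Pi_2 = \left(\frac{\beta_{21}\gamma_{11}}{1 - \beta_{12}\beta_{21}}, \; \frac{\gamma_{22}}{1 - \beta_{12}\beta_{21}}\right)',$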


which is the instrumental variable interpretation of FIML given by Hausman
(1975), equation (12). But in equation (2.7), we have the additional term
$\epsilon_2/(1 - \beta_{12}\beta_{21})$. What has happened is that FIML has used the covariance
restrictions to form a better instrument. Remember that $y_2$ forms the best
instrument for itself if it is predetermined in equation (2.1). But here, $y_2$
is jointly endogenous since it is correlated with $\epsilon_1$: from the reduced form
equation

$y_2 = Z\Pi_2 + \frac{\epsilon_2}{1 - \beta_{12}\beta_{21}} + \frac{\beta_{21}\epsilon_1}{1 - \beta_{12}\beta_{21}},$


FIML cannot use the last term in forming the instrument for $y_2$ since
$\beta_{21}\epsilon_1/(1 - \beta_{12}\beta_{21})$ is correlated with the residual $\epsilon_1$ in equation (2.1). It is
this last term which makes $y_2$ endogenous in the first equation. However,
FIML can use $\epsilon_2/(1 - \beta_{12}\beta_{21})$ because $\epsilon_2$ is uncorrelated with $\epsilon_1$ by the
covariance restriction $\sigma_{12} = 0$. By using this term, FIML creates a better
instrument than ordinary 2SLS which ignores $\epsilon_2/(1 - \beta_{12}\beta_{21})$. Our two equation
example  makes  it  clear  why  3SLS  is  not  asymptotically  efficient  when 
covariance  restrictions  are  present.  FIML  uses  better  instruments  than  3SLS 
and  produces  a  better  estimate  of  the  included  endogenous  variables. 

Two  other  important  cases  can  be  examined  with  our  simple  two  equation 
model. First, suppose that $\beta_{21} = 0$. The specification is then triangular,
and given the diagonal covariance matrix, the model is recursive. Here, the
FIML instrument is $Z\Pi_2 + \epsilon_2 = y_2$ so that $y_2$ is predetermined and FIML
becomes OLS as expected. The second case returns to $\beta_{21} \neq 0$ but sets $\gamma_{22} = 0$.
The first equation is no longer identified by coefficient restrictions alone,
but it is identified by the covariance restrictions because the FIML
instruments are


(2.9)     $W_1 = \left(Z\Pi_2 + \frac{\epsilon_2}{1 - \beta_{12}\beta_{21}}, \; z_1\right).$


Because of the addition of the residual term in $W_1$, the instrument matrix has
full rank and the coefficients can be estimated.
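The rank argument can be illustrated numerically. The sketch below simulates the two equation model with $\gamma_{22} = 0$ and $\sigma_{12} = 0$; the parameter values, sample size, and the helper `numerical_rank` are assumptions chosen for illustration, and the true $\beta_{21}$ is used to form the structural residual so that only the rank question is in view.

```python
import numpy as np

def numerical_rank(A, rtol=1e-8):
    """Rank from singular values, with an explicit relative tolerance (illustrative helper)."""
    s = np.linalg.svd(A, compute_uv=False)
    return int((s > rtol * s[0]).sum())

rng = np.random.default_rng(0)
T = 500
beta12, beta21, gamma11 = 0.5, -0.8, 1.0   # hypothetical values; gamma22 = 0

z1 = rng.normal(size=T)
e1 = rng.normal(size=T)
e2 = rng.normal(size=T)                    # drawn independently of e1, so sigma_12 = 0

# Reduced form of (2.1)-(2.2) when z2 is excluded from (2.2)
d = 1.0 - beta12 * beta21
y2 = (beta21 * gamma11 * z1 + beta21 * e1 + e2) / d
y1 = beta12 * y2 + gamma11 * z1 + e1

X1 = np.column_stack([y2, z1])             # right hand side variables of (2.1)

# 2SLS instruments: fitted y2 from the only available predetermined variable, plus z1
y2_hat = z1 * (z1 @ y2) / (z1 @ z1)
W_2sls = np.column_stack([y2_hat, z1])

# Instruments of the form (2.9): add the structural residual of (2.2)
eps2 = y2 - beta21 * y1
W_fiml = np.column_stack([y2_hat + eps2 / d, z1])

rank_2sls = numerical_rank(W_2sls.T @ X1)
rank_fiml = numerical_rank(W_fiml.T @ X1)
print(rank_2sls, rank_fiml)
```

With only $z_1$ available, the fitted value of $y_2$ is proportional to $z_1$, so the 2SLS instrument matrix loses rank; adding the residual term restores it.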

FIML needs to be iterated to solve the first order conditions; in our
two equation case, we see that the original first order condition (2.4a) or
its transformed version, equation (2.7), is nonlinear. But we know if the
covariance restrictions were not present, 3SLS (or here 2SLS) gives
asymptotically efficient instruments. Since 3SLS is a linear IV estimator,
it is straightforward to compute and is included in many econometric computer
packages. Yet we also know that if $\sigma_{12} = 0$ then 3SLS is not asymptotically
efficient. Furthermore, if $\beta_{21} \neq 0$ and $\gamma_{22} = 0$, it would not be clear how to do
2SLS on the first equation, since it is not identified by the coefficient
restrictions alone. If we were to use the 2SLS instruments $\hat{y}_2 (= Z\hat{\Pi}_2)$ and
$z_1$, the instrument matrix $(W_1'X_1)$ would be singular as expected since the
rank condition fails. The FIML solution which accounts for the covariance
restriction $\sigma_{12} = 0$ is very suggestive. Suppose that $\hat{\epsilon}_2 = y_2 - X_2\hat{\delta}_2$ is used as
an instrumental variable for equation (2.1) in addition to $z_1$ and $z_2$, as is
the case for FIML. It follows immediately that the optimal estimator which
uses $z_1$, $z_2$, and $\hat{\epsilon}_2$ as instrumental variables for equation (2.1) and $z_1$ and
$z_2$ for equation (2.2) is asymptotically more efficient than 3SLS since it
uses more instruments. Unfortunately, this estimator is quite complicated
because the estimated parameters which appear in the instruments affect its
asymptotic distribution.^4

Another approach to estimation which is similar in spirit to
instrumental variables is to try to utilize the extra moment restrictions


4. A full discussion of this instrumental variables estimator can be found
in an earlier version of this paper, Hausman, Newey, and Taylor (1983).

which are implied by the covariance restrictions. The variables $z_1$ and $z_2$
can be used as instruments because they are both uncorrelated with $\epsilon_1$ and
$\epsilon_2$. In addition FIML can use $\epsilon_1$ as an instrument in the equation with
disturbance $\epsilon_2$, and $\epsilon_2$ as an instrument in the equation with disturbance
$\epsilon_1$, because $E(\epsilon_{1t}\epsilon_{2t}) = 0$. We can account for this extra
moment  restriction  by  augmenting  equations  (2.1)  and  (2.2)  with  the 
additional  equation 


(2.10)    $(y_{1t} - X_{1t}\delta_1)(y_{2t} - X_{2t}\delta_2) = e_t,$


where $e_t$ has mean zero.

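This additional equation is nonlinear in the parameters, but when an
initial consistent estimator $\hat{\delta}$ is available it can be linearized around the
initial estimate. A first order Taylor's expansion of equation (2.10)
around $\hat{\delta}$ gives

$\hat{\epsilon}_{1t}\hat{\epsilon}_{2t} - \hat{\epsilon}_{2t}X_{1t}(\delta_1 - \hat{\delta}_1) - \hat{\epsilon}_{1t}X_{2t}(\delta_2 - \hat{\delta}_2) = \tilde{e}_t,$

where $\hat{\epsilon}_{it} = y_{it} - X_{it}\hat{\delta}_i$ and $\tilde{e}_t$ is equal to $e_t$ plus a second order term.
Collecting terms with unknown values of the parameters on the right-hand
side gives


(2.11)    $\hat{\epsilon}_{1t}\hat{\epsilon}_{2t} + \hat{\epsilon}_{2t}X_{1t}\hat{\delta}_1 + \hat{\epsilon}_{1t}X_{2t}\hat{\delta}_2 = \hat{\epsilon}_{2t}X_{1t}\delta_1 + \hat{\epsilon}_{1t}X_{2t}\delta_2 + \tilde{e}_t.$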

The parameter vector $\delta$ can now be estimated, while accounting for the
presence of covariance restrictions, by joint 3SLS estimation of equations
(2.1), (2.2), and (2.11), imposing the cross equation restrictions and using
a vector of ones as the instrument for the last equation.^5 We will refer to
this estimator as augmented 3SLS (A3SLS), since the original equation system has
been augmented by an equation which is generated by the covariance
restriction. It will be shown below that in general the A3SLS estimator is
asymptotically equivalent to FIML when the disturbances of the original
equation system are normally distributed and is also efficient relative to
FIML when the disturbances are nonnormal.^6
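The A3SLS recipe can be sketched on simulated data. The code below is an illustration under stated assumptions, not the authors' implementation: the parameter values are hypothetical, the joint estimation step is written as an optimally weighted linear GMM step over the stacked moment conditions (a common way to write system IV when instruments differ across equations), and the weighting matrix is the sample second moment of the moment contributions evaluated at the initial estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100_000
b12, g11, b21, g22 = 0.5, 1.0, -0.8, 1.0      # hypothetical true values

z1, z2 = rng.normal(size=T), rng.normal(size=T)
e1, e2 = rng.normal(size=T), rng.normal(size=T)   # independent, so sigma_12 = 0
d = 1.0 - b12 * b21
y2 = (b21 * g11 * z1 + g22 * z2 + b21 * e1 + e2) / d
y1 = b12 * y2 + g11 * z1 + e1

Z = np.column_stack([z1, z2])
X1 = np.column_stack([y2, z1])                # regressors of (2.1)
X2 = np.column_stack([y1, z2])                # regressors of (2.2)

# Initial consistent estimates: 2SLS, which is simple IV here (just identified)
d1 = np.linalg.solve(Z.T @ X1, Z.T @ y1)
d2 = np.linalg.solve(Z.T @ X2, Z.T @ y2)
r1, r2 = y1 - X1 @ d1, y2 - X2 @ d2

# Linearized extra equation (2.11): left hand side and regressor rows
lhs = r1 * r2 + r2 * (X1 @ d1) + r1 * (X2 @ d2)
R = np.column_stack([r2[:, None] * X1, r1[:, None] * X2])   # T x 4

# Stacked sample moments g(delta) = a - A delta, with delta = (b12, g11, b21, g22)
zeros = np.zeros((2, 2))
A = np.vstack([
    np.hstack([Z.T @ X1, zeros]),             # Z'(y1 - X1 delta_1)
    np.hstack([zeros, Z.T @ X2]),             # Z'(y2 - X2 delta_2)
    R.sum(axis=0)[None, :],                   # ones'(lhs - R delta)
]) / T
a = np.concatenate([Z.T @ y1, Z.T @ y2, [lhs.sum()]]) / T

# Weighting matrix from the moment contributions at the initial estimate
G = np.column_stack([Z * r1[:, None], Z * r2[:, None], r1 * r2])   # T x 5
S = G.T @ G / T

Sinv_A = np.linalg.solve(S, A)
delta_a3sls = np.linalg.solve(A.T @ Sinv_A, Sinv_A.T @ a)
print(delta_a3sls)
```

There are five moment conditions for four parameters, so the covariance restriction supplies the single overidentifying condition in this example.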

Direct  use  of  the  extra  moment  restrictions  also  yields  a  fruitful 
approach to estimation when covariance restrictions are needed for
identification. For example, suppose that $\gamma_{22} = 0$ in equation (2.2). Then $z_2$
is no longer useful as an instrument because it does not appear in the
reduced form. However, we can still obtain an estimator of the unknown
parameter vector $(\beta_{12}, \gamma_{11}, \beta_{21})$ by utilizing equation (2.10). Consider an
estimator $\hat{\delta}$ which is obtained as the solution to the three equations

(2.12a)   $z_1'(y_1 - \hat{\beta}_{12} y_2 - \hat{\gamma}_{11} z_1) = 0,$


(2.12b)   $z_1'(y_2 - \hat{\beta}_{21} y_1) = 0,$


5. A consistent estimator of the joint covariance matrix of the
disturbances of the augmented equations system can be obtained in the usual
way from the residuals for equations (2.1), (2.2), and (2.11) with $\delta = \hat{\delta}$.
In addition, $z_1$ and $z_2$ can also be used as instruments for equation (2.11) without
affecting the efficiency of the estimator. See Section 4.

6. Throughout the paper FIML will refer to the estimator which is obtained
by  performing  maximum  likelihood  under  the  assumption  that  the  disturbances 
of  the  original  equation  system  are  normally  distributed.   If  they  do  not 
have  a  normal  distribution  then  FIML  will  be  a  quasi  maximum  likelihood 
estimator. 

(2.12c)   $(y_2 - \hat{\beta}_{21} y_1)'(y_1 - \hat{\beta}_{12} y_2 - \hat{\gamma}_{11} z_1) = 0.$

The  first  two  equations  use  the  instrumental  variable  moment  conditions  that 
$z_{1t}$ is orthogonal to $\epsilon_{1t}$ and $\epsilon_{2t}$ while the third equation uses the moment
condition $E[\epsilon_{1t}\epsilon_{2t}] = 0$. This estimator is a generalized method of moments
(GMM,  see  Hansen  (1982))  estimator  which  uses  the  moment  condition  implied  by 
the  covariance  restriction  in  addition  to  the  usual  instrumental  variable 
orthogonality  conditions.   Generally  the  solution  to  such  an  equation  will 
require  iteration  because  the  product  of  two  residuals  is  quadratic  in  the 
unknown  parameters,  although  in  this  simple  example,  which  has  a  recursive 
structure  and  is  thus  covered  by  the  results  of  Section  4.2  of  Hausman  and 
Taylor (1983), iteration is not required. Note that the solution to equation
(2.12) can be obtained by first solving (2.12b) for $\hat{\beta}_{21}$ and then solving
equations (2.12a) and (2.12c) for $\hat{\beta}_{12}$ and $\hat{\gamma}_{11}$, which amounts to first doing
2SLS on equation (2.2) using $z_1$ as an instrument, and then doing 2SLS on
equation (2.1) using $z_1$ and $\hat{\epsilon}_2 = y_2 - \hat{\beta}_{21} y_1$ as instruments.
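The two-step solution of (2.12) is easy to try on simulated data. The following is a minimal sketch, assuming hypothetical parameter values and a large simulated sample; the helper `iv` is an illustrative name for the just-identified instrumental variables estimator.

```python
import numpy as np

def iv(y, X, W):
    """Just-identified IV estimator: solves W'(y - X d) = 0 for d."""
    return np.linalg.solve(W.T @ X, W.T @ y)

rng = np.random.default_rng(1)
T = 200_000
beta12, beta21, gamma11 = 0.5, -0.8, 1.0   # hypothetical values; gamma22 = 0

z1 = rng.normal(size=T)
e1 = rng.normal(size=T)
e2 = rng.normal(size=T)                    # independent of e1, so sigma_12 = 0

d = 1.0 - beta12 * beta21
y2 = (beta21 * gamma11 * z1 + beta21 * e1 + e2) / d
y1 = beta12 * y2 + gamma11 * z1 + e1

# Step 1, equation (2.12b): IV on the second equation with z1 as instrument
b21_hat = iv(y2, y1[:, None], z1[:, None])[0]
e2_hat = y2 - b21_hat * y1

# Step 2, equations (2.12a) and (2.12c): IV on the first equation
# with z1 and the estimated residual e2_hat as instruments
X1 = np.column_stack([y2, z1])
W1 = np.column_stack([e2_hat, z1])
b12_hat, g11_hat = iv(y1, X1, W1)

print(b21_hat, b12_hat, g11_hat)
```

No iteration is needed because the second equation can be estimated on its own, after which its residual serves as an ordinary instrument for the first equation.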

3.   FIML Estimation in the M-Equation Case

We  now  turn  to  the  general  case  where  zero  restrictions  are  present  on 
some  elements  of  the  covariance  matrix,  but  the  covariance  matrix  is  not 
necessarily  assumed  to  be  diagonal.  We  consider  the  standard  linear 
simultaneous  equations  model  where  all  identities  are  assumed  to  have  been 
substituted  out  of  the  system  of  equations: 

(3.1)     $YB + Z\Gamma = U$

where Y is the T x M matrix of jointly endogenous variables, Z is the T x K
matrix of predetermined variables, and U is the T x M matrix of the structural
disturbances of the system. The model has M equations and T observations.
It is assumed that B is nonsingular and that Z is of full rank. We assume
that  plim  (1/T)  (Z'U)  =  0,  and  that  the  second  order  moment  matrices  of  the 
current  predetermined  and  endogenous  variables  have  nonsingular  probability 
limits.   Lastly,  if  lagged  endogenous  variables  are  included  as  predetermined 
variables,  the  system  is  assumed  to  be  stable. 

The  structural  disturbances  are  assumed  to  be  mutually  independent  and 
identically  distributed  as  a  nonsingular  M-variate  normal  distribution: 

(3.2)     $U \sim N(0, \Sigma \otimes I_T)$

where $\Sigma$ is positive definite. However, we allow for restrictions on the
elements of $\Sigma$ of the form $\sigma_{ij} = 0$ for $i \neq j$, which distinguishes this from
the case that Hausman (1975) examined. In deriving the first order
conditions for the likelihood function, we will only solve for the unknown
elements of $\Sigma$ rather than the complete matrix as is the usual case. Using
the results of Hausman and Taylor (1982) and Section 5, we assume that each
equation in the model is identified by use of coefficient restrictions on the
elements of B and $\Gamma$ and covariance restrictions on elements of $\Sigma$.
We  will  make  use  of  the  reduced  form  specification, 


(3.3)     $Y = -Z\Gamma B^{-1} + UB^{-1} = Z\Pi + V.$
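The mapping from the structural form (3.1) to the reduced form (3.3) can be checked with a few lines of linear algebra. This is a small sketch with arbitrary illustrative dimensions and coefficient values.

```python
import numpy as np

rng = np.random.default_rng(3)
T, M, K = 50, 2, 3                       # illustrative dimensions

B = np.array([[1.0, -0.8],
              [-0.5, 1.0]])              # nonsingular, unit diagonal
Gamma = rng.normal(size=(K, M))
Z = rng.normal(size=(T, K))
U = rng.normal(size=(T, M))

B_inv = np.linalg.inv(B)
Pi = -Gamma @ B_inv                      # reduced form coefficients
V = U @ B_inv                            # reduced form disturbances
Y = Z @ Pi + V                           # equation (3.3)

# The generated Y satisfies the structural form (3.1)
print(np.allclose(Y @ B + Z @ Gamma, U))
```

Each column of V mixes all structural disturbances through the corresponding column of the inverse of B, which is why zero covariance restrictions can make part of a reduced form disturbance usable as an instrument.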


As we saw in the last section, it is the components of $v_i = (UB^{-1})_i$ that give
the extra instruments that arise because of the covariance restrictions. The
other form of the original system of equations which will be useful is the
so-called "stacked" form. We use the normalization rule $B_{ii} = 1$ for all i and
then rewrite each equation in regression form where only unknown parameters
appear on the right-hand side:

(3.4)     $y_i = X_i\delta_i + u_i$

where $X_i = [Y_i, Z_i]$, $\delta_i' = [\beta_i', \gamma_i']$, $Y_i$ is the $T \times r_i$ matrix of included
endogenous variables, $Z_i$ is a $T \times s_i$ matrix of included predetermined
variables, and $\delta_i$ is the $q_i = r_i + s_i$ dimensional vector of structural
coefficients for the ith equation. It will prove convenient to stack these M
equations into a system

(3.5)     $y = X\delta + u$


where

$y = (y_1', \ldots, y_M')'$,  $X = \mathrm{diag}[X_1, \ldots, X_M]$,  $\delta = (\delta_1', \ldots, \delta_M')'$,  $u = (u_1', \ldots, u_M')'$.

Note that $\delta$ is the $q = \sum_{i=1}^{M} q_i$ dimensional vector of structural coefficients.
Likewise, we stack the reduced form equations

(3.6)     $y = \bar{Z}\pi + v$

where $\bar{Z} = [I_M \otimes Z]$ and $\pi = \mathrm{vec}(\Pi)$ is the vector of reduced form
coefficients.^7

The log likelihood function arises from the model specification in
equation (3.1) and the distribution assumption of equation (3.2):

(3.7)     $L(B, \Gamma, \Sigma) = c + \frac{T}{2} \log \det(\Sigma)^{-1} + T \log|\det(B)| - \frac{T}{2}\,\mathrm{tr}\left[\frac{1}{T}\Sigma^{-1}(YB + Z\Gamma)'(YB + Z\Gamma)\right]$


where  the  constant  c  is  disregarded  in  maximization  procedures.  We  now 
calculate  the  first  order  necessary  conditions  for  a  maximum  by  matrix 
differentiation.   The  procedures  used  and  the  conditions  derived  are  the  same 
as  in  Hausman  (1975,  p.  730).  To  reduce  confusion,  we  emphasize  that  we  only 
differentiate with respect to unknown parameters, and we use the symbol $\overset{u}{=}$ to
remind the reader of this fact. Thus the number of equations in each block
of the first order conditions equals the number of unknown parameters; e.g.,
the number of equations in (3.8a) below equals the number of unknown
parameters in B rather than $M^2$. The first order conditions are


(3.8a)    $\frac{\partial L}{\partial B}: \quad T(B')^{-1} - Y'(YB + Z\Gamma)\Sigma^{-1} \overset{u}{=} 0,$


(3.8b)    $\frac{\partial L}{\partial \Gamma}: \quad -Z'(YB + Z\Gamma)\Sigma^{-1} \overset{u}{=} 0,$


(3.8c)    $\frac{\partial L}{\partial \Sigma^{-1}}: \quad T\Sigma - (YB + Z\Gamma)'(YB + Z\Gamma) \overset{u}{=} 0.$


7. Here vec(.) denotes the usual column vectorization of a matrix.


In particular, note that we cannot postmultiply equation (3.8b), or later,
the transformed versions of equation (3.8a), to eliminate $\Sigma^{-1}$, as a simple
two equation example will easily convince the reader.
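The derivative expressions for B and for the coefficient matrix on Z can be checked against finite differences of the log likelihood. The sketch below is an illustrative check, not part of the original derivation: it differentiates with respect to every element (ignoring which elements are known a priori), omits the constant c, and uses arbitrary data and parameter values; the function names are hypothetical.

```python
import numpy as np

def loglik(B, Gamma, Sigma, Y, Z):
    """Log likelihood (3.7) without the constant c."""
    T = Y.shape[0]
    U = Y @ B + Z @ Gamma
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logabsdet_B = np.linalg.slogdet(B)
    quad = np.trace(np.linalg.solve(Sigma, U.T @ U))   # tr[Sigma^{-1} U'U]
    return -0.5 * T * logdet_sigma + T * logabsdet_B - 0.5 * quad

def analytic_scores(B, Gamma, Sigma, Y, Z):
    """Left hand sides of (3.8a) and (3.8b), for all elements."""
    T = Y.shape[0]
    U = Y @ B + Z @ Gamma
    US = U @ np.linalg.inv(Sigma)
    return T * np.linalg.inv(B).T - Y.T @ US, -Z.T @ US

def numeric_grad(f, A, h=1e-6):
    """Central finite differences, one matrix element at a time."""
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        E = np.zeros_like(A)
        E[idx] = h
        G[idx] = (f(A + E) - f(A - E)) / (2 * h)
    return G

rng = np.random.default_rng(5)
T, M, K = 40, 2, 3
Y = rng.normal(size=(T, M))
Z = rng.normal(size=(T, K))
B = np.array([[1.0, 0.3], [-0.6, 1.0]])
Gamma = rng.normal(size=(K, M))
Sigma = np.array([[1.0, 0.2], [0.2, 2.0]])

score_B, score_Gamma = analytic_scores(B, Gamma, Sigma, Y, Z)
num_B = numeric_grad(lambda A: loglik(A, Gamma, Sigma, Y, Z), B)
num_Gamma = numeric_grad(lambda A: loglik(B, A, Sigma, Y, Z), Gamma)
print(np.abs(score_B - num_B).max(), np.abs(score_Gamma - num_Gamma).max())
```

In an actual application only the rows and columns corresponding to unknown parameters would be set to zero, which is why the inverse of the disturbance covariance matrix cannot simply be cancelled from these conditions.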

Let  us  consider  the  first  order  conditions  in  reverse  order.  We  already 

know some elements of $\Sigma$ because of the covariance restrictions.^8 The unknown
elements are then estimated by $\hat{\sigma}_{ij} = (1/T)(y_i - X_i\hat{\delta}_i)'(y_j - X_j\hat{\delta}_j)$ where the
$\hat{\delta}$'s contain the estimates of the unknown elements of the B and $\Gamma$ matrices.
Equation  (3.8b)  causes  no  special  problems  because  it  has  the  form  of  the 


8. An alternative procedure is to use Lagrange Multiplier relationships of
the type $0 = \sigma_{ij} = (y_i - X_i\delta_i)'(y_j - X_j\delta_j)$ for known elements of $\Sigma$ but the
approach adopted in the paper seems more straightforward. Note that the
parameterization of $\Sigma$ used to obtain equation (3.8c) is the elements of $\Sigma^{-1}$
corresponding to the distinct, nonzero elements of $\Sigma$. It is straightforward
to check that the Jacobian of this transformation has full rank, so that
equation (3.8c) follows by the invariance of maximum likelihood.

first order conditions of the multivariate regression model. We now transform
equation (3.8a) to eliminate $T(B')^{-1}$ and to replace the matrix Y' by the
appropriate matrix of predicted Y's which are orthogonal to the U matrix.
First we transform equation (3.8a) by using the identity $\Sigma\Sigma^{-1} = I$:


(3.9)     $[T(B')^{-1}\Sigma - Y'(YB + Z\Gamma)]\Sigma^{-1} \overset{u}{=} 0.$


Note the presence in equation (3.9) of the term $(B')^{-1}\Sigma$ which is the key term
for the identification results in Hausman and Taylor (1983). We now do the
substitution similar to the one we used for equations (2.5) to (2.7) in the
two equation case. For equation (3.8c), we know that the elements of $\Sigma$ take
one of two forms. Either they equal the inner product of the residuals from
the appropriate equations divided by T or they equal zero. To establish some
notation, define the set $N_j$ as the indices m which denote for the j'th row of
$\Sigma$ that $\sigma_{jm} = 0$. Now we return to the first part of equation (3.9) and
consider the ij'th element of the matrix product


(3.10)    $[T(B')^{-1}\Sigma]_{ij} = T \sum_{k \notin N_j} \beta^{ki}\sigma_{kj} = \left(v_i - \sum_{k \in N_j} \beta^{ki} u_k\right)'u_j$


where $\beta^{ki}$ is the ki'th element of the inverse matrix $B^{-1}$. Note that if no
zero elements existed in column j of $\Sigma$ we would have $v_i'u_j$ on the right hand
side of (3.10), as in Hausman (1975, equation 11). We now use the expression
from equation (3.10) and combine it with the other terms from the bracketed
term in equation (3.9):

(3.11)    $[T(B')^{-1}\Sigma - Y'(YB + Z\Gamma)]_{ij} = \left[v_i - \sum_{k \in N_j} \beta^{ki} u_k - (Z\Pi_i + v_i)\right]'u_j$

          $= -\left[Z\Pi_i + \sum_{k \in N_j} \beta^{ki} u_k\right]'u_j$

          $= -[Z\Pi_i + U_j b_{ij}]'u_j,$


where $U_j$ is the $T \times |N_j|$ matrix of residuals uncorrelated with $u_j$ and $b_{ij}$ the
corresponding vector of $\beta^{ki}$ for k in $N_j$.
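The element-by-element identity behind (3.10) is purely algebraic once the restricted elements of the covariance estimate are set to zero and the rest are residual cross products divided by T, so it can be verified on arbitrary numbers. The sketch below assumes a three equation system with a single illustrative restriction between the first two disturbances; all values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
T, M = 200, 3
U = rng.normal(size=(T, M))                 # stand-in structural residuals
B = np.array([[1.0, 0.4, 0.0],
              [-0.3, 1.0, 0.5],
              [0.2, 0.0, 1.0]])
B_inv = np.linalg.inv(B)
V = U @ B_inv                               # v_i = (U B^{-1})_i

# Zero restriction sigma_12 = sigma_21 = 0 (0-based indices 0 and 1)
restricted = np.zeros((M, M), dtype=bool)
restricted[0, 1] = restricted[1, 0] = True

# Covariance estimate: residual cross products / T, zeros where restricted
Sigma = (U.T @ U) / T
Sigma[restricted] = 0.0

lhs = T * B_inv.T @ Sigma                   # T (B')^{-1} Sigma

rhs = np.empty((M, M))
for j in range(M):
    N_j = np.flatnonzero(restricted[j])     # indices k with sigma_jk = 0
    for i in range(M):
        # v_i minus the part built from residuals uncorrelated with u_j
        remainder = V[:, i] - U[:, N_j] @ B_inv[N_j, i]
        rhs[i, j] = remainder @ U[:, j]

print(np.allclose(lhs, rhs))
```

For a column j with no restrictions the set of indices is empty and the right hand side reduces to the plain cross product of the reduced form and structural residuals.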

As with equation (2.7) we see that FIML replaces the jointly endogenous
variable $y_i = Z\Pi_i + v_i$ with the prediction from the predetermined variables
$Z\Pi_i$ and those terms of $v_i = (UB^{-1})_i$ which are uncorrelated with $u_j$ because of
zero restrictions on the $\sigma_{ij}$'s. Thus we rewrite equation (3.9) as

(3.12)    $-[(Z(-\Gamma B^{-1}) + \tilde{V})'(YB + Z\Gamma)]\Sigma^{-1} \overset{u}{=} 0$


where the ij'th element of $\tilde{V}$ is $U_j b_{ij}$.


Equation (3.12) demonstrates the essential difference for FIML estimation
which arises between the case of no covariance constraints and the present
situation. We see that in addition to the usual term $Z\Pi$, we have the extra
elements $\tilde V$, which are uncorrelated structural residuals multiplied by the
appropriate elements of $B^{-1}$. Thus FIML uses the covariance restrictions to
form a better predictor of Y than the usual $-Z\hat\Gamma\hat B^{-1}$.^9


Note that if $[(B')^{-1}\Sigma]_{ij} = 0$ in equation (3.10), equations i and j are
relatively recursive. Then $y_i$ is predetermined in the j'th equation rather
than jointly endogenous, and equation (3.11) reduces to the same form
as equation (3.8b): columns of Y are treated like columns of Z. In
general, however, $y_i$ is replaced by a predicted value which is composed of
two terms: a prediction of the mean from the predetermined variables and an
estimate of part of the reduced form disturbance from the uncorrelated
structural residuals. For future reference, we gather together our
transformed first order conditions which arise from equations (3.8a) -
(3.8c):


(3.13a)   $-\begin{bmatrix} -(B')^{-1}\Gamma'Z' + \tilde V' \\ Z' \end{bmatrix}(YB + Z\Gamma)\Sigma^{-1} = 0,$


(3.13b)   $T\Sigma - (YB + Z\Gamma)'(YB + Z\Gamma) \overset{u}{=} 0.$

9. See Hausman and Taylor (1983).

We now calculate the asymptotic Cramer-Rao bound for the estimator.
Under our assumptions, we have a linear structural model for an i.n.i.d.
specification. We do not verify regularity conditions here since they have
been given for this model before, e.g., Hood and Koopmans (1953) or
Rothenberg (1973).^10 Let $\bar B = \mathrm{diag}(\bar B_1, \ldots, \bar B_M)$, $\bar B_i = [(B^{-1})_i, 0_i]'$,
where $0_i$ is an $M \times s_i$ null matrix which corresponds to the $s_i$ included
predetermined variables, $(B^{-1})_i$ is the matrix of columns of $B^{-1}$ which
corresponds to the $r_i$ included right hand side endogenous variables, and the
normalization and exclusion restrictions have been imposed on B. Let P be
the $M^2 \times M^2$ permutation matrix such that vec(A) = P vec(A') for any
M-dimensional square matrix A. Let $\sigma^*$ be the M(M+1)/2 dimensional column
vector of distinct elements of $\Sigma$,
$\sigma^* = (\sigma_{11}, \ldots, \sigma_{M1}, \sigma_{22}, \ldots, \sigma_{M2}, \ldots, \sigma_{MM})'$, and let R be the
$[M(M+1)/2] \times M^2$ matrix such that $\mathrm{vec}(\Sigma) = R'\sigma^*$ for all $\Sigma$ (see Richard
(1975)). Let $\sigma$ be the [M(M+1)/2]-L dimensional vector of unrestricted
elements of $\sigma^*$ and let S be the $([M(M+1)/2]-L) \times M(M+1)/2$ matrix such that
$\sigma^* = S'\sigma$ for all $\Sigma$. Let $D_i = [\Pi_i, I_i]$, where $\Pi_i$ consists of the columns of
$\Pi$ corresponding to the included right hand side endogenous variables and
$I_i$ is a selection matrix such that $Z_i = ZI_i$. Let $D = \mathrm{diag}(D_1, \ldots, D_M)$, and
let Q = plim Z'Z/T. Note that $\mathrm{plim}(Z'X_i/T) = QD_i$ and $\mathrm{plim}(U'X_i/T) = \Sigma\bar B_i'$.
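The permutation matrix P defined above (often called a commutation matrix) can be written down explicitly. The following is a minimal pure-Python sketch, not part of the original derivation; the helper names `vec`, `transpose`, `commutation_matrix` and `matvec` are hypothetical and chosen only for illustration. It builds P for a given M and checks the defining property vec(A) = P vec(A') on a small example.

```python
def vec(a):
    """Stack the columns of a square matrix (given as a list of rows)."""
    m = len(a)
    return [a[r][c] for c in range(m) for r in range(m)]


def transpose(a):
    m = len(a)
    return [[a[c][r] for c in range(m)] for r in range(m)]


def commutation_matrix(m):
    """Return the M^2 x M^2 permutation matrix P with vec(A) = P vec(A')."""
    n = m * m
    p = [[0] * n for _ in range(n)]
    for r in range(m):
        for c in range(m):
            # A[r][c] sits at position c*m + r of vec(A)
            # and at position r*m + c of vec(A').
            p[c * m + r][r * m + c] = 1
    return p


def matvec(p, x):
    return [sum(pij * xj for pij, xj in zip(row, x)) for row in p]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
P = commutation_matrix(3)
property_holds = matvec(P, vec(transpose(A))) == vec(A)
```

Because P only swaps the positions of the (i,j) and (j,i) elements, applying it twice returns the original vector, so the same matrix also satisfies vec(A') = P vec(A).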

Theorem 3.1: If the structural disturbances are normally distributed then the
information matrix for the unknown parameter vectors $\delta$ and $\sigma$ is given by

(3.14)   $J(\delta, \sigma) = \begin{bmatrix} \bar B P \bar B' + \mathrm{plim}\, \frac{1}{T} X'(\Sigma^{-1} \otimes I_T)X & \bar B(\Sigma^{-1} \otimes I_M)R'S' \\ SR(\Sigma^{-1} \otimes I_M)\bar B' & \frac{1}{2}SFS' \end{bmatrix},$

where $F = R(\Sigma^{-1} \otimes \Sigma^{-1})R'$. Further, the inverse of the Cramer-Rao bound
for $\delta$ is

(3.15)   $(J^{\delta\delta})^{-1} = D'(\Sigma^{-1} \otimes Q)D + 2\bar B(\Sigma^{-1} \otimes I_M)R'[F^{-1} - S'(SFS')^{-1}S]R(\Sigma^{-1} \otimes I_M)\bar B'.$


10. The most straightforward approach to regularity conditions is to use the
reduced form. The reduced form has a classical multivariate least squares
specification subject to nonlinear parameter restrictions. Since the
likelihood function is identical for either the structural or reduced form
specification, the more convenient form can be used for the specific problem
being considered.


The first term in equation (3.15) is the inverse of the 3SLS asymptotic
covariance matrix. Since the second term is positive semi-definite, 3SLS is
asymptotically inefficient relative to FIML in the presence of covariance
restrictions.

Further insight into the efficiency gain from imposing the covariance
restrictions can be obtained by examining the diagonal covariance matrix
case. Let $P_{ij}$ be the ij'th M-dimensional square block of P, i,j=1,...,M, and
let $P^*$ be the $M^2 \times M$ matrix $P^* = [P_{11}, \ldots, P_{MM}]'$.


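Corollary 3.2: If the structural disturbances are normally distributed and
the covariance matrix is diagonal, the information matrix for $\delta$ and $\sigma$ is
given by

(3.16)   $J(\delta, \sigma_{11}, \ldots, \sigma_{MM}) = \begin{bmatrix} \bar B P \bar B' + \mathrm{plim}\, \frac{1}{T} X'(\Sigma^{-1} \otimes I_T)X & \bar B P^* \Sigma^{-1} \\ \Sigma^{-1} P^{*\prime} \bar B' & \frac{1}{2}\Sigma^{-1}\Sigma^{-1} \end{bmatrix}.$


Further, the inverse of the Cramer-Rao lower bound for $\delta$ is


(3.17)   $(J^{\delta\delta})^{-1} = D'(\Sigma^{-1} \otimes Q)D + \bar B(P + \Sigma^{-1} \otimes \Sigma - 2P^*P^{*\prime})\bar B'.$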


The first term in equation (3.17) is the inverse of the 3SLS asymptotic
covariance matrix. By comparing the first term with the second term we can
easily see that the larger is the covariance matrix $\Sigma$ of the disturbance
vector $U_t$ relative to the second moment matrix Q of the instruments, the
larger is the efficiency gain which can be obtained by imposing the
covariance restrictions. For example, if $\Sigma$ is multiplied by a scalar which
is larger than one then the second term is unaffected while the first term
decreases. In other words, the efficiency gain from imposing covariance
restrictions will be relatively large where the population R-squared for the
reduced form equations is relatively small, as might be the case in
cross-section data.

4.    Instrumental Variables in the M-Equation Case

A convenient and efficient alternative to FIML can be obtained by
utilizing the extra moment conditions which follow from the covariance
restrictions. In the absence of covariance restrictions conventional
instrumental variable estimators, such as 2SLS and 3SLS, make use of the
fact that the instrumental variables are orthogonal to the disturbances.
The covariance restrictions add moment conditions which can also be used in
estimation. Let S be an $M^2 \times L$ selection matrix of rank L such that the
covariance restrictions are given by $S'\mathrm{vec}(\Sigma) = 0$, where $\mathrm{vec}(\Sigma)$ denotes the
usual column vectorization of $\Sigma$, and let $e_t = S'\mathrm{vec}(U_t'U_t)$.^11 Then the
covariance restrictions imply that the $L \times 1$ vector $e_t$ has mean zero. We can
account for these additional moment restrictions by augmenting the original
M equations with the L additional equations

(4.1)     $S'[(y_t - X_t\delta) \otimes (y_t - X_t\delta)] = e_t, \quad (t = 1, \ldots, T),$

where $y_t = (y_{1t}, \ldots, y_{Mt})'$ and $X_t = \mathrm{diag}[x_{1t}, \ldots, x_{Mt}]$.
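To make the extra equations concrete, the following pure-Python sketch evaluates the left-hand side of (4.1) for a single observation. It is an illustration only: the function name `extra_equation_residuals` is hypothetical, the selection matrix S is represented by a list of 0-based positions in the vectorized covariance matrix, and the numbers are made up.

```python
def extra_equation_residuals(y_t, X_t, delta, selected):
    """Evaluate S'[(y_t - X_t delta) kron (y_t - X_t delta)] for one observation.

    y_t      : list of M left-hand side values
    X_t      : M x q matrix (list of rows), block diagonal in the regressors
    delta    : list of q coefficients
    selected : 0-based positions of the restricted elements in vec(Sigma)
    """
    # Structural residuals u_t = y_t - X_t delta (an M-vector).
    u = [y - sum(x * d for x, d in zip(row, delta)) for y, row in zip(y_t, X_t)]
    # Kronecker product u kron u, which equals vec(u u').
    kron = [ui * uj for ui in u for uj in u]
    # Applying S' just picks out the restricted cross products.
    return [kron[p] for p in selected]


# Two equations, one regressor each, restriction sigma_12 = 0 (position 1).
e_t = extra_equation_residuals(
    y_t=[3.0, 1.0],
    X_t=[[2.0, 0.0], [0.0, 3.0]],
    delta=[1.0, 1.0],
    selected=[1],
)
```

Each element of the result is a product of two residuals, each linear in the coefficients, which is why the additional equations are quadratic in the parameters.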

These additional equations are nonlinear (quadratic) in the parameters.
When an initial consistent estimator $\hat\delta$ of the parameter vector $\delta$ is
available this nonlinearity can be eliminated by linearizing the extra
equations around $\hat\delta$. Using the fact that for an M-dimensional square matrix
A it is the case that vec(A') = P vec(A), and using
$\partial\,\mathrm{vec}(\hat U_t'(y_t - X_t\delta)')/\partial\delta = \partial((y_t - X_t\delta) \otimes \hat U_t')/\partial\delta = -X_t \otimes \hat U_t'$,
we can calculate the first-order Taylor's expansion of equation k of (4.1)
around $\hat\delta$,

(4.2)     $\hat e_{kt} + S_k'(I + P)(X_t \otimes \hat U_t')\hat\delta = S_k'(I + P)(X_t \otimes \hat U_t')\delta + r_{kt},$

where I is an $M^2$ dimensional identity matrix, $S_k$ is the kth column of S,
$\hat U_t' = y_t - X_t\hat\delta$, $\hat e_{kt} = S_k'(\hat U_t' \otimes \hat U_t')$, $r_{kt}$ is equal to $e_{kt}$ plus a second
order term, and terms with $\delta$ are collected on the right-hand side. Let

$y_{ak} = (\hat e_{k1} + S_k'(I + P)(X_1 \otimes \hat U_1')\hat\delta, \ldots, \hat e_{kT} + S_k'(I + P)(X_T \otimes \hat U_T')\hat\delta)'$

be the $T \times 1$ vector of observations on the left-hand side variable of equation
(4.2) and let

$X_{ak} = [(X_1' \otimes \hat U_1)(I + P)S_k, \ldots, (X_T' \otimes \hat U_T)(I + P)S_k]'$

be the $T \times q$ matrix of right-hand side variables of equation (4.2). We can
then write the observations for the linearized equation (4.2) as

(4.3)     $y_{ak} = X_{ak}\delta + r_k, \quad (k = 1, \ldots, L),$

where $r_k = (r_{k1}, \ldots, r_{kT})'$. An estimator of $\delta$ which accounts for the
presence of covariance restrictions can now be obtained by joint 3SLS
estimation of the L equations (4.3) and the original equation system (3.5),
using a vector of ones as the instrumental variable for equation (4.2),
which estimator we will refer to as A3SLS.

11. The matrix S is not unique, because of symmetry of $\Sigma$.

To obtain the A3SLS estimator it is convenient to stack the additional
equations (4.3) which follow from the covariance restrictions with the
observations for the original equation system to form an augmented equation
system. Let $y_a = (y', y_{a1}', \ldots, y_{aL}')'$ be the (M+L)T dimensional vector of
observations on the left-hand side variables and $X_a = [X', X_{a1}', \ldots, X_{aL}']'$
the $(M+L)T \times q$ matrix of observations on the right-hand side variables of both
equations (3.5) and (4.3). Then we can write the augmented equation system
which adds the L equations (4.3) to the original equation system (3.5) as


(4.4)     $y_a = X_a\delta + r,$


where r is equal to $(u', r_1', \ldots, r_L')'$.

To form the A3SLS estimator we use Z as instrumental variables for each
of the original equations of the augmented system, and a $T \times 1$ vector of ones,
which we will denote by a, for each of the additional equations. The
corresponding $(M+L)T \times (MK+L)$ matrix of instrumental variables for the
augmented equation system will be $Z_a = \mathrm{diag}[I_M \otimes Z, I_L \otimes a]$. We also
estimate the covariance matrix of the augmented system
$\Omega = E[(U_t, e_t')'(U_t, e_t')]$ using the estimator

$\hat\Omega = \frac{1}{T}\sum_{t=1}^{T} (\hat U_t, \hat e_t')'(\hat U_t, \hat e_t'),$

which is formed from the residuals of the linearized system evaluated at
$\delta = \hat\delta$. The A3SLS estimator $\hat\delta_A$ can then be obtained as


(4.5)     $\hat\delta_A = (X_a'Z_aV_T^{-1}Z_a'X_a)^{-1}X_a'Z_aV_T^{-1}Z_a'y_a,$


where

$V_T = \begin{bmatrix} \hat\Sigma \otimes Z'Z & \hat\Omega_{12} \otimes Z'a \\ \hat\Omega_{12}' \otimes a'Z & T\hat\Omega_{22} \end{bmatrix}$

and $\hat\Omega$ is partitioned conformably with $(U_t, e_t')$.

The form of the A3SLS estimator differs from that of a standard 3SLS
estimator in two respects. The first is that there are cross-equation
restrictions in the augmented equation system. The parameters which enter
the original equation system also enter the additional equations which arise
from the covariance restrictions. The second way A3SLS is different is that
different instruments are used for different equations, so that $Z_a$ and $V_T$ do
not have a Kronecker product form. This difference can be eliminated when
the matrix of instrumental variables Z includes a column of ones, as is
usually the case in applications, by using all the columns of Z as
instrumental variables for the last L equations of the augmented equation
system. The resulting estimator, say $\tilde\delta$, could be obtained from equation
(4.5) by replacing $Z_a$ by $I_{M+L} \otimes Z$ and $V_T$ by $\hat\Omega \otimes Z'Z$. This alternative
estimator may be easier to compute in some circumstances since a standard
3SLS computer program can be used. Because it is 3SLS with more
instruments, it will be no less efficient asymptotically than A3SLS,
although in general it would not be numerically equal to the A3SLS estimator.

We have derived the A3SLS estimator by linearizing the extra equations
which arise from the covariance restrictions. We could have proceeded by
linearizing the FIML first order conditions (i.e. by obtaining the
Rothenberg and Leenders (1964) linearized MLE) but the resulting estimator
would be more complicated than the A3SLS estimator because of the presence
of the parameters of the disturbance covariance matrix for the original
system. The A3SLS estimator has the advantage that it has a familiar form
and, at least when the entire matrix Z is used as the instrumental variables
for the additional equations, can be implemented by using existing software
(e.g. TSP). Furthermore, as will be shown below, the A3SLS estimator will
often be efficient relative to FIML when $U_t$ is nonnormal.

Like FIML, the A3SLS estimator has an interpretation as an
instrumental variables estimator for the original equation system, where
residuals are used as instruments in addition to Z. Residuals can be added
to the list of instruments by allowing $\hat W = [I_M \otimes Z, (I_M \otimes \hat U)S]$ to be the
$MT \times (MK+L)$ matrix of instrumental variables for the original equation
system (3.5). Note that S is a selection matrix, each column of which
corresponds to one covariance restriction $\sigma_{ij} = 0$ for $i \neq j$. The column of
S which corresponds to $\sigma_{ij} = 0$ will either select $\hat u_i$ as an instrument for
equation j or $\hat u_j$ as an instrument for equation i. For example, in a two
equation system with the restriction $\sigma_{12} = 0$, the matrix S is either
(0, 1, 0, 0)' or (0, 0, 1, 0)'. If S is equal to (0, 1, 0, 0)' then


$\hat W = \begin{bmatrix} Z & 0 & \hat u_2 \\ 0 & Z & 0 \end{bmatrix}$


and $\hat u_2$ is selected as an instrument for equation 1.
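The two-equation example can be generalized with a small helper. The sketch below is an illustration under an assumption: it takes the instrument matrix to have the Kronecker structure (I_M kron U)S described above, so that a selected position in the vectorized covariance matrix falls in the block of one equation and in the column of one residual. The function name `instrument_assignment` is hypothetical.

```python
def instrument_assignment(position, m):
    """Map a 1-based position in vec(Sigma) to (residual, equation), both 1-based.

    position : index of the unit element in a column of the selection matrix S
    m        : number of equations M
    """
    k = position - 1
    equation = k // m + 1  # block of (I_M kron U) holding the selected column
    residual = k % m + 1   # column of U inside that block
    return residual, equation


# S = (0, 1, 0, 0)' has its unit element in position 2.
first_choice = instrument_assignment(2, 2)
# S = (0, 0, 1, 0)' has its unit element in position 3.
second_choice = instrument_assignment(3, 2)
```

For S = (0, 1, 0, 0)' the helper returns residual 2 for equation 1, matching the example in the text; the alternative choice of S assigns residual 1 to equation 2.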

To obtain an instrumental variables interpretation of A3SLS we need to
be specific about what estimator of $\delta$ is used to form the residual matrix $\hat U$.
For the moment we will assume that the covariance restrictions are not
needed for identification of the parameters in the original equation system
(3.1), and that the initial estimator $\hat\delta$ is a system instrumental variables
estimator given by


(4.6)     $\hat\delta = (B'Z'X)^{-1}B'Z'y$


for some $MK \times q$ linear combination matrix B (here Z'X and Z'y are to be read
with the block-diagonal instrument matrix $I_M \otimes Z$ in place of Z). For
example, if B' is equal to $X'Z(I_M \otimes (Z'Z)^{-1})$ then $\hat\delta$ is the vector of 2SLS
estimators, and if B' is equal to $X'Z(\hat\Sigma^{-1} \otimes (Z'Z)^{-1})$ then $\hat\delta$ is the 3SLS
estimator.

Theorem 4.1: If $\hat\delta$ satisfies equation (4.6) then there exists a $(MK+L) \times q$
linear combination matrix $B^*$ such that


(4.7)     $\hat\delta_A = (B^{*\prime}\hat W'X)^{-1}B^{*\prime}\hat W'y.$


The matrix $B^*$ is given in the appendix.^12

To obtain the asymptotic properties of the A3SLS estimator it is
convenient to assume that certain regularity conditions are satisfied. Let
$|x| = \max_i |x_i|$ for $x = (x_1, \ldots, x_n)$.

Assumption 4.1: The observations $(U_t, Z_t)$ are independently not identically
distributed such that there exist $\gamma, M > 0$ such that


$E[|U_t|^{4+\gamma}] < M, \quad E[|Z_t|^{4+\gamma}] < M, \quad (t = 1, 2, \ldots).$


This assumption could be relaxed to allow dependent observations as long as
the disturbance vectors for different observations are independent.


12. It can be shown (see Hausman, Newey, Taylor (1983)) that $B^*$ is the
optimal choice of a linear combination matrix to use when using residuals as
instrumental variables for the original equation system. This version of
A3SLS is complicated to implement because of the use of estimated residuals
for instruments. Due to correlation of the disturbances and the right-hand
side variables in the structural equations, $S'(I_M \otimes \hat U)'u/\sqrt T$ does not have the
same asymptotic distribution as $S'(I_M \otimes U)'u/\sqrt T$ and, consequently, $B^*$
depends on the particular estimator which is used to form the residuals.

Assumption 4.2: For all t, $U_t$ has constant conditional raw moments up to the
fourth order which do not depend on t, $E[U_t] = 0$, $E[e_t] = 0$, plim Z'Z/T = Q,
and the probability limits of averages of all products up to the fourth
order of elements of $Z_t$ and $U_t$ exist.

This assumption rules out heteroscedasticity in either the original equation
system or the additional equations, which arise from the covariance
restrictions, as well as specifying that the disturbance vector $U_t$ has mean
zero and that the covariance restrictions are satisfied. Let

$G_1 = (I_M \otimes Q)D, \quad G_2 = S'(I + P)(I_M \otimes \Sigma)\bar B',$


$G = [G_1', G_2']',$


$V = \begin{bmatrix} \Sigma \otimes Q & \Omega_{12} \otimes \mathrm{plim}(Z'a/T) \\ \Omega_{12}' \otimes \mathrm{plim}(a'Z/T) & \Omega_{22} \end{bmatrix}.$


Assumption 4.3: The matrices Q and $\Omega$ are nonsingular, and the matrix G has
rank q.

This assumption implies that V, which is the asymptotic covariance matrix of
the vector of products of instrumental variables and disturbances for the
augmented equation system, is nonsingular. This assumption also implies
that G, which is equal to $\mathrm{plim}(Z_a'X_a/T)$, has rank q. As will be discussed
in Section 5, rank(G) = q is a condition for local identification of $\delta$ under
covariance restrictions.

The following result gives the asymptotic distribution of the A3SLS
estimator.

Theorem 4.2: If Assumptions 4.1 - 4.3 are satisfied and the initial
estimator $\hat\delta$ is such that $\sqrt T(\hat\delta - \delta)$ is bounded in probability then

$\sqrt T(\hat\delta_A - \delta) \xrightarrow{d} N(0, (G'V^{-1}G)^{-1})$

and $\mathrm{plim}[T(X_a'Z_aV_T^{-1}Z_a'X_a)^{-1}] = (G'V^{-1}G)^{-1}$.

This result says that $(G'V^{-1}G)^{-1}$ is the asymptotic covariance matrix of the
A3SLS estimator and that the usual estimator of this asymptotic covariance
is consistent. The hypothesis that $\sqrt T(\hat\delta - \delta)$ is bounded in probability will
be satisfied if $\hat\delta$ is asymptotically normal. For example, if the covariance
restrictions are not needed for identification then both the 2SLS and 3SLS
estimators will satisfy this hypothesis, while if the covariance
restrictions are necessary for identification then the GMM estimator
discussed below can be used as an initial estimator when forming A3SLS.

The important question concerning the asymptotic properties of the A3SLS
estimator is its asymptotic efficiency. The following result says that the
A3SLS estimator is asymptotically efficient when the disturbance vector $U_t$ is
normally distributed.

Theorem 4.3: If Assumptions 4.1 - 4.3 are satisfied and the distribution of
$U_t$ conditional on $Z_t$ is normal then $\sqrt T(\hat\delta_A - \hat\delta_{FIML})$ converges in probability
to zero.


This result has a straightforward explanation. From the instrumental
variables interpretation of FIML given in Section 3 we can see that FIML
uses the fact that Z is orthogonal to the disturbances and that $u_i$ and $u_j$
are orthogonal for $\sigma_{ij} = 0$. This information is exactly that which is used to
form the augmented equation system and the A3SLS estimator. Furthermore, as
stated in the following result, the A3SLS estimator is asymptotically
equivalent to the best nonlinear 3SLS (BNL3SLS) estimator for the augmented
equation system and is therefore efficient in the class of nonlinear
instrumental variable estimators for the augmented equation system. Let $\hat\delta_B$
be the BNL3SLS estimator (Amemiya (1977)) for the augmented equation system
consisting of equations (3.5) and (4.1).^13


Theorem 4.4: If Assumptions 4.1 - 4.3 are satisfied then $\sqrt T(\hat\delta_A - \hat\delta_B)$
converges in probability to zero.


The asymptotic efficiency of A3SLS can now be seen to result from the
fact that both FIML and A3SLS use the same information, i.e. are members of
the same class of estimators, and A3SLS is asymptotically efficient in this
class.

13. A version of this BNL3SLS estimator is derived in the appendix.

Theorem 4.4 also sheds some light on the comparison of FIML and A3SLS
estimators when the disturbances of the original equation system do not have
a multivariate normal distribution. The efficiency of A3SLS in the class of
instrumental variable estimators for the augmented equation system does not
depend in any way on normality because we have not imposed any particular
form for the covariance matrix of the augmented equation system. If the
disturbance vector of the original equation system is normally distributed
then third raw moments of the disturbances are zero and fourth raw moments
consist of products of second raw moments, so that the covariance matrix of
the disturbance vector of the augmented equation system is given by


$\bar\Omega = \begin{bmatrix} \Sigma & 0 \\ 0 & S'[(\Sigma \otimes \Sigma)(I + P)]S \end{bmatrix}$


and the resulting form for V is


$\bar V = \begin{bmatrix} \Sigma \otimes Q & 0 \\ 0 & \bar\Omega_{22} \end{bmatrix}.$
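The block structure above rests on the standard moment formulas for a zero-mean normal vector. As a hedged worked derivation (a textbook result, not reproduced from the paper), the steps can be written out as follows, with P the permutation matrix defined in Section 3 and S the selection matrix of Section 4:

```latex
% Third moments of a zero-mean normal vector vanish, so U_t and e_t are uncorrelated:
E[u_{it}\,u_{jt}\,u_{kt}] = 0 .
% Fourth moments are sums of products of second moments:
E[u_{it}\,u_{jt}\,u_{kt}\,u_{lt}]
  = \sigma_{ij}\sigma_{kl} + \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk} ,
% so the covariance between two cross products is
\mathrm{Cov}(u_{it}u_{jt},\, u_{kt}u_{lt})
  = \sigma_{ik}\sigma_{jl} + \sigma_{il}\sigma_{jk} .
% Stacking all cross products and then selecting the restricted ones:
\mathrm{Var}[\mathrm{vec}(U_t'U_t)] = (\Sigma \otimes \Sigma)(I + P),
\qquad
\mathrm{Var}(e_t) = S'[(\Sigma \otimes \Sigma)(I + P)]S .
```

When the disturbances are not normal, neither the zero off-diagonal block nor the expression for the lower-right block need hold, which is the source of the possible efficiency difference between FIML and A3SLS.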


FIML imposes this special form for V, which we denote $\bar V$, and as a result
may be asymptotically inefficient relative to A3SLS if $U_t$ is nonnormal. The
following result gives a necessary and sufficient condition for FIML to be
as efficient as A3SLS, which depends only on the way V and $\bar V$ differ.

Theorem 4.5: If Assumptions 4.1 - 4.3 are satisfied then FIML is as
efficient, asymptotically, as A3SLS if and only if there exists a
nonsingular q-dimensional matrix H such that


$V^{-1}G = \bar V^{-1}GH.$


■^^*  See  Henderson  and  Searle  (1979)  for  the  expression  for  Cpp* 
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FIML  also  imposes  the  special  form  of  V  when  forming  standard  error 
estimates from the inverse information matrix formula so that the usual
standard error formulae for $\hat\delta_{FIML}$ may be wrong if $U_t$ is nonnormal.^14

An initial consistent estimator of $\delta$ is required to form the A3SLS
estimator. When the covariance restrictions are not needed for
identification then an IV estimator such as 2SLS or 3SLS will suffice. When
covariance restrictions are necessary for identification we must take a
different approach. Direct use of the moment restrictions also provides a
useful approach to obtaining an initial estimator of $\delta$. Let


$g_{1T}(\delta) = (I_M \otimes Z)'u/T, \quad g_{2T}(\delta) = S'\mathrm{vec}(U'U)/T,$
$g_T(\delta) = (g_{1T}(\delta)', g_{2T}(\delta)')'.$


Note that $g_T(\delta)$ is a MK+L dimensional vector of sample moments which has
expectation zero when $\delta$ is equal to the true parameter value. It is formed
of the usual MK dimensional vector of products of instrumental variables and
residuals plus an additional L dimensional vector formed as the sample
average of the vector of residuals $e_t$ for the additional equations which
arise from the covariance restrictions. A GMM estimator $\hat\delta_G$ which utilizes
the covariance restrictions can now be obtained by minimizing a quadratic
form in the moment functions $g_T(\delta)$, i.e. by solving


(4.8)     $\min_{\delta \in D}\; g_T(\delta)'\Psi_T\, g_T(\delta),$


where D is a subset of $R^q$ and $\Psi_T$ is a positive semi-definite matrix. A
similar estimation method has recently been suggested by Rothenberg (1983),
who motivates the GMM method as a modified minimum distance approach.

14. Similar points about the consequences of nonnormality for the properties
of FIML when covariance restrictions are present have been made by others in
the context of panel data, e.g., Chamberlain (1982).
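The moment vector and the quadratic form in (4.8) are simple to evaluate once residuals are available. The following pure-Python sketch is illustrative only: the function names `moment_vector` and `gmm_objective` are hypothetical, the residual matrix is taken as given rather than computed from the structural coefficients, the covariance restrictions are represented by 0-based positions in the vectorized covariance matrix, and the data are made up.

```python
def moment_vector(U, Z, selected):
    """Stack the instrument moments and the covariance-restriction moments.

    U        : T x M matrix of structural residuals (list of rows)
    Z        : T x K matrix of instruments (list of rows)
    selected : 0-based positions in vec(Sigma) that are restricted to zero
    """
    T, M, K = len(U), len(U[0]), len(Z[0])
    # First MK moments: instruments times residuals, equation by equation.
    g1 = [sum(Z[t][k] * U[t][m] for t in range(T)) / T
          for m in range(M) for k in range(K)]
    # Last L moments: sample averages of the restricted residual cross products.
    # Position p of vec(.) refers to element (p % M, p // M).
    g2 = [sum(U[t][p % M] * U[t][p // M] for t in range(T)) / T
          for p in selected]
    return g1 + g2


def gmm_objective(g, Psi):
    """Quadratic form g' Psi g for a weighting matrix Psi (list of rows)."""
    n = len(g)
    return sum(g[a] * Psi[a][b] * g[b] for a in range(n) for b in range(n))


U = [[1.0, 2.0], [-1.0, 0.0]]   # T = 2 observations, M = 2 equations
Z = [[1.0], [1.0]]              # a single instrument (a constant)
g = moment_vector(U, Z, selected=[1])
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
value = gmm_objective(g, identity)
```

In practice the residuals depend on the coefficient vector, so an outer numerical minimizer would call these two functions repeatedly; that step is omitted here.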


The minimization problem (4.8) is quartic in the parameters of $\delta$ and
may therefore not be very difficult to solve in many circumstances. One
computationally convenient choice of the matrix $\Psi_T$ is given by
$\Psi_T = \mathrm{diag}[I_q, 0]$. This choice is one which minimizes the role of the vector
of quadratic functions $g_{2T}(\delta)$ in the minimization problem (4.8) and thus may
simplify the computation of $\hat\delta_G$. For this choice of $\Psi_T$ the minimization
problem (4.8) is essentially the same as solving for $\hat\delta_G$ by setting the first
q elements of $g_T(\delta)$ to be equal to zero in estimation, i.e., using just
enough restrictions to give just-identification.

It remains for us to verify that $\hat\delta_G$ will satisfy the hypothesis of
Theorem 4.2 and thus qualify as an initial estimator which can be used in
forming the A3SLS estimator.^15 Let $g(\delta) = \mathrm{plim}\, g_T(\delta)$, which exists by
previous assumptions.

Theorem 4.6: If Assumptions 4.1 - 4.3 are satisfied, the true value of $\delta$ is
contained in the interior of D which is compact, $\mathrm{plim}\, \Psi_T = \Psi$, $\Psi g(\delta) = 0$
only if $\delta$ is equal to its true value, and $\mathrm{rank}(\Psi G) = q$, then $\sqrt T(\hat\delta_G - \delta)$
converges in distribution to a normal random vector.


15. An efficient choice of $\Psi_T$, which yields an estimator which is
asymptotically equivalent to A3SLS, is given by $V_T^{-1}$. It is possible that in
small samples this nonlinear estimator would perform better than A3SLS, which
is obtained by linearizing.

Identification of $\delta$ is essential for the existence of a consistent
estimator. In particular, the condition that $\Psi g(\delta) = 0$ have a unique solution
and that $\mathrm{rank}(\Psi G) = q$ are conditions for identification of $\delta$ using the moment
vector $g_T(\delta)$. In the next section we analyze identification using the moment
vector $g_T(\delta)$, concentrating on the local identification condition that G
have full column rank.

Finally, it is useful to note that the results of this section do not
apply only to the case of zero covariance restrictions. All of the above
results, including asymptotic efficiency of A3SLS, apply without modification
to the case of linear homogeneous restrictions $S'\mathrm{vec}(\Sigma) = 0$, where S need not
be a selection matrix.

5.   Identification

It is well known that covariance restrictions can help to identify the
parameters of a simultaneous equations system (see the references in Hsiao
(1983)). Hausman and Taylor (1983) have recently provided necessary and
sufficient conditions for identification of a single equation of a
simultaneous system using covariance restrictions, and have suggested a
possible interpretation of identification of a simultaneous system which is
stated in terms of an assignment of residuals as instruments. In this
section we analyze the local identification of $\delta$ using the moment functions
$g_T(\delta)$.
From the derivation of equation (4.2) it follows that $\partial g(\delta)/\partial\delta =
\mathrm{plim}(\partial g_T(\delta)/\partial\delta) = -G$, so that rank(G) = q is a natural condition for local
identification of $\delta$ from the moment vector $g_T(\delta)$. We can also relate this
condition to familiar identification conditions for the structural
parameters of a simultaneous equations system. As discussed by Rothenberg
(1971), nonsingularity of the information matrix is a necessary and
sufficient condition for first order local identification at any regular
point of the information matrix.^16 Nonsingularity of the information matrix
for the normal disturbance case is equivalent to the matrix G having full
column rank, as stated in the following result.


16. A regular point of the information matrix is one where the information
matrix has constant rank in a neighborhood of the point. The set of such
points has measure one, Rothenberg (1971). Also, see Sargan (1983) for a
discussion of the relationship between identification and first order
identification.

Lemma 5.1: If Q and $\Sigma$ are nonsingular then the information matrix J is
nonsingular if and only if rank(G) = q.

Thus the rank of G plays a crucial role in determining the identification of
the structural parameters of a simultaneous equations system.

The matrix G has an interesting structure. The first MK rows form a
matrix $G_1 = (I_M \otimes Q)D = \mathrm{plim}((I_M \otimes Z)'X/T)$ which is familiar from the analysis
of identification via coefficient restrictions. The covariance restrictions
play a role in determining the rank of G through the matrix of the last L
rows $G_2 = S'(I + P)(I_M \otimes \Sigma)\bar B' = S'(I + P)\,\mathrm{plim}((I_M \otimes U)'X/T)$. The kth row of $G_2$,
corresponding to the covariance restriction $\sigma_{ij} = 0$, has zero for each
element except for the elements corresponding to $\delta_i$, where $\Sigma_j\bar B_i' =
\mathrm{plim}(u_j'X_i/T)$ appears, and the elements corresponding to $\delta_j$, where $\Sigma_i\bar B_j' =
\mathrm{plim}(u_i'X_j/T)$ appears ($\Sigma_j$ denoting the j'th row of $\Sigma$). Thus the kth row of
$G_2$ contains both the covariance of $u_j$ with the right-hand side variables for
equation i and the covariance of $u_i$ with the right-hand side variables for
equation j. We can exploit this structure to obtain necessary conditions
for identification which are stated in terms of using residuals as
instruments.

An assignment of residuals as instruments is a choice for each
covariance restriction $\sigma_{ij} = 0$ to either assign $u_i$ as an instrument for
equation $j$ or $u_j$ as an instrument for equation $i$, but not both. Since for
each covariance restriction there are two distinct ways of assigning a
residual as an instrument there are $2^L$ possible distinct assignments. For
each assignment of residuals as instruments, which we will index by
$p = 1,...,2^L$, let $U_{pi}$ be the (possibly nonexistent) matrix of observations on
the disturbances assigned to equation $i$. Let $W_{pi} = [Z, U_{pi}]$ be the
resulting matrix of instrumental variables for equation $i$ and $C_{pi} =
\mathrm{plim}(W_{pi}'X_i/T)$ the population cross-product matrix of instrumental variables
and right-hand side variables for equation $i$.
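As a concrete illustration of the definition above, the following Python sketch enumerates every assignment of residuals as instruments for a small set of covariance restrictions. The function name `enumerate_assignments` and the example restrictions are hypothetical and chosen only for illustration; they do not come from the analysis in this paper.

```python
from itertools import product


def enumerate_assignments(restrictions):
    """Yield every assignment of residuals as instruments.

    restrictions: list of pairs (i, j), i != j, one pair per covariance
    restriction sigma_ij = 0 (hypothetical input format).
    Each yielded assignment maps an equation index to the list of
    residual indices assigned to it as instruments.
    """
    for choice in product((0, 1), repeat=len(restrictions)):
        assigned = {}
        for (i, j), c in zip(restrictions, choice):
            # c == 0: u_i is an instrument for equation j
            # c == 1: u_j is an instrument for equation i
            eq, resid = (j, i) if c == 0 else (i, j)
            assigned.setdefault(eq, []).append(resid)
        yield assigned


# Hypothetical example: sigma_12 = 0 and sigma_13 = 0, so L = 2.
restrictions = [(1, 2), (1, 3)]
assignments = list(enumerate_assignments(restrictions))
for p, assignment in enumerate(assignments, start=1):
    print(p, assignment)
```

With $L = 2$ restrictions the loop visits $2^2 = 4$ assignments, one of which gives both $u_2$ and $u_3$ to equation 1.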


Theorem 5.2: If rank$(G) = q$ then there exists an assignment $p$ such that

(5.1)     rank$(C_{pi}) = q_i$, $(i=1,...,M)$.


This result means that for a regular point a necessary condition for first
order identification is that there is an assignment of residuals as
instruments such that the population cross-product matrix of instrumental
variables and right-hand side variables has full column rank for each
equation. Note that if rank$(C_{pi}) = q_i$ there must be at least $q_i$
instrumental variables for the $i$th equation. We can thus obtain an order
condition for residuals assigned as instruments. Let $a_i = \max(0, q_i - K)$ be
the number of instrumental variables which are required for the $i$th equation
in addition to $Z$ to have the same number of instrumental variables as
right-hand side variables.

Corollary 5.3: If rank$(G) = q$ then there exists an assignment $p$ such that
at least $a_i$ residuals are assigned as instruments to equation $i$,
$i = 1,...,M$.

This order condition says that there must exist an assignment of residuals
as instruments so that there are enough instruments to estimate each
equation. We can use Hall's Theorem on the existence of a system of
distinct representatives, in a similar fashion to the use of this theorem
by Geraci (1976), to obtain an algorithm for determining whether or not such
an assignment of residuals as instruments exists. Let $R_i$ be the set of
indices $k$ of distinct covariance restrictions such that the $k$th covariance
restriction is $\sigma_{ij} = 0$ for some $j \neq i$. Note that $R_i$ is just the set of indices
of distinct covariance restrictions which involve equation $i$. For $a_i > 0$
let $\bar{R}_i$ be an $a_i$-tuple consisting of $a_i$ copies of $R_i$ and let $\bar{R}$ equal the
$(\sum_{i=1}^{M} a_i)$-tuple $\bar{R} = (\bar{R}_1,...,\bar{R}_M)$.

Theorem 5.4: There exists an assignment of residuals as instruments such
that for each $i=1,...,M$ at least $a_i$ residuals are assigned as instruments
to equation $i$ if and only if for each $m = 1,...,\sum_{i=1}^{M} a_i$ the union of any $m$
components of $\bar{R}$ contains at least $m$ distinct indices $k$.
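The condition of Theorem 5.4 need not be checked by inspecting every union of components. By Hall's Theorem it holds exactly when each required instrument "slot" can be matched to its own covariance restriction, which a standard augmenting-path bipartite matching decides. The following Python sketch works under that equivalence; the function name `order_condition_holds`, its input format, and the example values are assumptions made for illustration and are not taken from the paper.

```python
def order_condition_holds(q, K, restrictions):
    """Check the order condition for residuals assigned as instruments.

    q: dict mapping equation index i to q_i, the number of right-hand
       side variables in equation i.
    K: number of predetermined variables (columns of Z).
    restrictions: list of pairs (i, j), one per restriction sigma_ij = 0.
    """
    # One slot per required extra instrument: a_i = max(0, q_i - K) copies of i.
    slots = [i for i, qi in q.items() for _ in range(max(0, qi - K))]
    match = {}  # restriction index k -> slot index currently using it

    def try_assign(s, seen):
        # Look for a restriction involving the equation of slot s,
        # re-routing previously matched slots along an augmenting path.
        for k, pair in enumerate(restrictions):
            if slots[s] in pair and k not in seen:
                seen.add(k)
                if k not in match or try_assign(match[k], seen):
                    match[k] = s
                    return True
        return False

    return all(try_assign(s, set()) for s in range(len(slots)))


# Hypothetical three-equation example with K = 2 predetermined variables.
q_example = {1: 3, 2: 3, 3: 2}          # so a_1 = 1, a_2 = 1, a_3 = 0
ok_two = order_condition_holds(q_example, 2, [(1, 2), (1, 3)])
ok_one = order_condition_holds(q_example, 2, [(1, 2)])
print(ok_two, ok_one)
```

In the first call the restriction $\sigma_{13} = 0$ can supply $u_3$ to equation 1 while $\sigma_{12} = 0$ supplies $u_1$ to equation 2, so the condition holds; with the single restriction $\sigma_{12} = 0$ one residual cannot serve both equations and the condition fails.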


So far, each of the identification results of this section has been
stated in terms of the number and variety of instruments for each equation;
see Koopmans et al. (1950). It is well known (see Fisher (1966) pp. 52-56)
that when only coefficient restrictions are present the condition that
$\mathrm{plim}(Z'X_i/T)$ have rank $q_i$ can be translated into a more transparent
condition on the structural parameters $A = [B',\Gamma']'$. When covariance
restrictions are present we can also state a rank condition which is
equivalent to $\mathrm{plim}(W_{pi}'X_i/T)$ having rank $q_i$. For an assignment $p$, let $\Sigma_{pi}$
be the rows of $\Sigma$ corresponding to residuals which are assigned as
instrumental variables to the $i$th equation. Let $\phi_i$ be the $(M+K-1-q_i)\times(M+K)$
selection matrix such that the exclusion restrictions on the $i$th equation can
be written as $\phi_i A_i = 0$, where $A_i$ is the $i$th column of $A$.

Lemma 5.5: For a particular assignment $p$ and an equation $i$, the rank of
$C_{pi}$ equals $q_i$ if and only if

(5.2)     rank$[A'\phi_i', \Sigma_{pi}'] = M-1$.


When this result is combined with Theorem 5.2 we can see that Theorem 5.2 is
a stronger necessary condition than Fisher's (1966, Theorem 4.6.2)
Generalized Rank Condition, which says that a necessary condition for
identification of the $i$th equation is that the rank of $[A'\phi_i', \Sigma_i']$ is $M-1$,
where $\Sigma_i$ is all the rows $(\Sigma)_j$ of $\Sigma$ such that $\sigma_{ij} = 0$. Theorem 5.2 strengthens
this necessary condition by requiring that the rank condition only hold for
those rows of $\Sigma$ corresponding to residuals which are assigned to equation $i$.

So  far  we  have  only  presented  necessary  conditions  for  identification. 
We  can  also  give  a  sufficient  condition  for  local  identification  which 
includes  the  recursive  case  of  Proposition  9  of  Hausman  and  Taylor  (1983). 

Theorem 5.6: If for a subset of covariance restrictions there is exactly
one assignment $p$ of residuals as instruments such that rank$(C_{pi}) = q_i$,
$(i=1,...,M)$, then rank$(G) = q$.

It is not known whether the existence of an assignment $p$ such that the rank
condition (5.1) is satisfied is sufficient for local identification when
there is more than one such assignment, although we suspect this
to be the case.

The  previous  results  have  the  virtue  that  they  can  be  checked  on  an 
equation by equation basis. It is possible to obtain a necessary and
sufficient  condition  for  local  identification  in  terms  of  the  structural 
parameters,  although  this  result  involves  the  restrictions  on  all  the 
equations  and  is  not  readily  interpretable  in  terms  of  instrumental 
variables. 

Theorem 5.7: The matrix $G$ has full column rank if and only if

(5.3)     rank$\left(\mathrm{diag}[\phi_1,...,\phi_M, S'] \cdot [I_M \otimes A', (I_M \otimes \Sigma)(I+P)]'\right) = M(M-1)$.

The  identification  results  of  this  section  are  local  in  nature.   The 
question  of  global  identification  with  general  zero  covariance  restrictions 
is more difficult because the moment functions $g_T(\delta)$ are nonlinear
(quadratic)  in  the  parameters.   In  fact  Bekker  and  Pollock  (1984)  have 
recently  given  an  example  of  a  system  of  simultaneous  equations  subject  to 
exclusion  and  zero  covariance  restrictions  which  is  locally  but  not 
globally  identified.  Thus  the  problem  of  global  identification  remains 
somewhat  problematical  in  the  general  exclusion  and  zero  covariance 
restriction  case  without  further  restrictions  on  the  parameter  space. 
MATHEMATICAL APPENDIX


Some properties of the permutation matrix $P$ are useful for deriving the
information matrix. From Henderson and Searle (1979) we know that $P$ is
symmetric, $P^{-1} = P$, and for any $M$-dimensional square matrices $A$ and $B$,
$P(A \otimes B) = (B \otimes A)P$. Let $\tilde{R}$ be the $M^2 \times (M(M+1)/2)$ matrix obtained from $R'$ by
replacing the rows of $R'$ corresponding to $\sigma_{ij}$, $i \neq j$, by 1/2 times the
corresponding row of $R'$ and by leaving the rows of $R'$ corresponding to $\sigma_{ii}$
unchanged (see Richard (1975)). Then $PR' = R'$, $P\tilde{R} = \tilde{R}$, $2\tilde{R}R = I + P$, and

$(R(\Sigma^{-1} \otimes \Sigma^{-1})R')^{-1} = \tilde{R}'(\Sigma \otimes \Sigma)\tilde{R}$.

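Proof of Theorem 3.1: An expression for the information matrix which ignores
symmetry of $\Sigma$ and the covariance restrictions on $\Sigma$ is given in Hausman
(1983), where the notation is identical to that used here except that $P$ is
there denoted by $E$. Equation (3.14) then follows by vec$(\Sigma) = R'S\sigma$ and the
chain rule, using $R(I_M \otimes \Sigma)P(I_M \otimes \Sigma)R' = R(\Sigma \otimes \Sigma)PR' = R(\Sigma \otimes \Sigma)R'$ and $(I+P)R'
= 2R'$. To obtain the expression for the Cramer-Rao bound note that

(A.1)     $\bar{B}(\Sigma^{-1} \otimes I_M)R'\left[\tfrac{1}{2}R(\Sigma^{-1} \otimes \Sigma^{-1})R'\right]^{-1}R(\Sigma^{-1} \otimes I_M)'\bar{B}'$

$= \bar{B}(\Sigma^{-1} \otimes I_M)R'\tilde{R}'(2\Sigma \otimes \Sigma)\tilde{R}R(\Sigma^{-1} \otimes I_M)'\bar{B}'$
$= \tfrac{1}{2}\bar{B}(\Sigma^{-1} \otimes I_M)(I+P)(\Sigma \otimes \Sigma)(I+P)(\Sigma^{-1} \otimes I_M)'\bar{B}'$
$= \bar{B}P\bar{B}' + \bar{B}(\Sigma^{-1} \otimes \Sigma)\bar{B}'$.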
We can also compute

(A.2)     $\mathrm{plim}\, X'(\Sigma^{-1} \otimes I_T)X/T = D'(\Sigma^{-1} \otimes Q)D + \bar{B}(\Sigma^{-1} \otimes \Sigma)\bar{B}'$.

Using equation (A.2) to obtain the upper-left block of the information matrix
and adding and subtracting equation (A.1) to the partitioned inverse formula
yields equation (3.15).

Proof of Corollary 3.2: When the covariance matrix is restricted to be
diagonal we compute $R'S = \mathrm{diag}(e_1,...,e_M) = P^*$ where $e_i$ is the $i$th unit
vector. Also, $\Sigma^{-1} \otimes \Sigma^{-1}$ is a diagonal matrix, with element $(\sigma_{ii})^{-2}$ in the
$i$th position of the $i$th block, so that the lower right block of the
information matrix is as given in equation (3.16). To obtain the upper right
block, note that $\Sigma^{-1} \otimes I_M = \mathrm{diag}[(\sigma_{11})^{-1}I_M,...,(\sigma_{MM})^{-1}I_M]$, so that

$(\Sigma^{-1} \otimes I_M)P^* = \mathrm{diag}[(\sigma_{11})^{-1}e_1,...,(\sigma_{MM})^{-1}e_M] = P^*\Sigma^{-1}$. The form of the
Cramer-Rao bound now follows from the partitioned inverse formula.


Proof  of  Theorem  4-1:   Note  that  I^^.X,  @U  '  =  (l  ®  U) 'X  so  that  for 
2  =  Ij^©Z 


(A.3)   z^'x^  =  {r1,   Z'(lj^©lJ)(l+P)s]' 


=  fx'w  +  [0,X'(lj,@  u)ps]  • 
S'Pd     (x)U)'X(A'Z'X)'^A'Z'X


M 


awx 


where 


A  = 


MK  0 

S'P(lj,^@U)'X(A"Z'X)"U'  Ij 


Also   note   that  I^^^    ^i  (S>  ^i  "  vec(U'U)   =    (lj^©U)'u,    so   that 


(A.4)     z;y^     =   (y'Z,    [  (y-X6)  '  (l^^  ®  U)  +  6 'X' (l^^  ©  U)(l+P)]s) ' 
=    (y'Z,   y'(lj^(|)U)S   +  y'ZA(X'ZA)~^X*(ljj©U)PS)' 


=  ^' 


0 
S'P(lj^®U)'X(A'Z'X)"^A'Z'y 


AW'y 


The  conclusion  then  follows  with 


(A. 5)  B*  =  X^'Z^T-^A. 


Proof  of  Theorem  4-»2:  Ity  /T(6-6)  bounded  in  probability  we  have  plim6=6. 
Then  Markov's  weak  law  of  large  numbers  and  the  uniform  bound edness  of  4+Y 

moments  of  the  data  imply  plim[Z^^  e  e  '/T  -  2+^.e  cVt]  =  0. 


Markov's  weak  law  of  large  numbers  and  the  uniform  boundedness  of  fourth 

order  moments  of  the  disturbance  also  imply  plim(r,_.  e  e'/T)  =  ECe.e,')  = 

t— 1   t  t      ^  " t^  t 

Q22*  '^'^   follows  that  plimQ^p  ~  pli^'iC^+.i  e^e,'/T)  =  Q-p.   Similarly  we  can 
obtain   plimi2  =  Q  and


(A. 6)      plimZ^'X^/T  =    pliin[x'Z/T,    X' (l^^  (5)  U)  (l+P  )S/t] ' 

=   [pliiii(X'Z/T),    plim(X'(lj.  ©U)/T)(I+P)S]• 
=  [d'(Ijj®q),  b(Ijj(x)i:)(i+p)s]'  =  g 


We  also  have 


(A. 7)      plim  V^/T 


plim(Z)   X   plim(Z*Z/T)      plim^^       x   plim(a'Z/T) 


(plim  V^^2^' 


plimQ 


22 


V. 


"By  nonsingularity  of  Q  and  Q  it   follows  that  V  is  nonsingxilar,    so   that 
plim(V^T)"''    =    (plimV^T)""'    =  V"'' .      By  rank    (G)  =   q,    G'V~"'g  is   nonsingular 
and 


-U.^    ^-1 


.Tr-1^^-1 


(A.8)      plim(;T(x;Z^V-^S;X^)-^)   =    (G'V-^G) 


This  equation  implies  that  the  asymptotic  covariance  matrix  estimator  is 
consistent. 

To  obtain  the  asymptotic  distribution  of  6 . ,  note  that  the  first  MK 
elements  of  Z'(y.  -  X.6)//T  make  up  the  vector  Z'u//T,  and  that  the-last  L 
elements  make  up  the  vector  ^ 


A,  .   ,A 


(A.9)  S*[vec(U'lO  +  (l+P)(l„(5)U)'X(6-6)]//T 
S'[vec(U,U)   +   vec({U-U}'U)   +   vec(U'{U-U})   +


vec({U-U}'{U-U})   +    (I+P)(lj,©  U)*X(6-6)]//T 


S'[vec(U'U)   +    (l+P)vec(U'{U-U})   +   vec({U-U} ' {U-U} ) 


+    (l+P)(ljj©U)*X(6-6)]//T 

S'[vec(U'U)   +    (I+P)(lj,j@  {fj-U})'X(6-6)   +  vec({U-U}  *  {U-U}  )]//T 
S'(lj5©U)'u//T  +  S'P[(ljj©{U  -  U})'X/t]/T(6-6). 


.A        . 

^  /T(6-6)  bounded   in  probability  the   second   term  after  the  final  equality  of 
(a. 9)   converges  in  probability  to   zero.     Therefore  we  have 


(A.10)     Z;(y^  -  X/)//T  =  /T  gj(6)  +  0^(1) 

where  gn,(6)  =  [u'Z/T,  u'(I^U)S/t]'.  By   uniform  boundedness  of  4+6  moments 
it  follows  from  Eieorem  3.1  of  White  (1980)  that 


d 
(A.11)  /Tg  „(6)  ^  N(0,V), 


Ihen  by  equations    (A. 6),    (A.?),    (A. 8),    (A. 10),   and   (A. 11),   we  have 


(A.12)     /T(6^-6)  =   [(X^'Z^/T)TV~^(Z^'X^/T)]"''(x;z^/T)TVj^Z^'(y^-X^6)//T 
(G'V~^G)"^G'V-VTg„(5)  +  0^(1)
-^       P 


A 

The  asymptotic  distribution  of  6.  now  follows  fron  equations  (A.11)  and 
(A. 12). 


Proof  of  Theorem  4.3:   From  equation  (3.11)  the  instrumental  variable  which 
FDIL  uses  for  y.  in  equation  j  is  ZIT.  +U.b..  =  ZII.  +  US*S*'(B~  ).  where 

(B"'').  here  denotes  the  ith  column  of  B""' ,  (l+P)S  =  S*  =  [s*',...,  2^'],  and 

S* .  is  rixL,  (j=1,...,M).   Using  this  notation  we  can  rewrite  the  FIML  first 
order  condition  for  6  (equation  (3.13a)  as 


(A.13)  D'(2-^  Ql^n'u^'i'iZ-^  01^)S*S*'    vec(U'U)  =0, 

where  S*  =  diag[s*, . . . ,S*] .   From  this  equation  and  Svec(U'U)  =  SPvec(U'U)  it 
follows  that  the  FIML  first  order  condition  can  be  written  as 

(A.14)  Ij  gj(6)  =  0 

where  gn,(6)  =  (u'Z,(vec(U'U)) 'S) ,  A,^  is  a  (MK+L)xq  linear  combination  matriz  .-. 
and  Am  satisfies  plimAm  =  A  for  a  constant  matrix  A.   Using  asymptotic 
argiments  similar  to  those  leading  to  equation  (A. 12)  it  follows  that 


(A.15)  *^T(5j.iML  -  ^)  =  (A'G)"^A/T  s^{&)   +  0^(1). 

By  equations  (A.11)  and  (A. 12)  6.  is  (asymptotically  equivalent  to)  an 
optimal  GMM  estimator  for  the  moment  functions  gni(6).   By  equation  (A.I  5) 

A  A 

6„y„^  is  also  a  GMM  estimator.   Since  6p™,  is  the  MLE  when  U.  is  normally 

A  A 

distributed,  6_t.,,,  is  no  less  efficient  than  6  ,  so  that  by  Theorem  3.2  of 
Hausman  (1982)  and  Y  =  "^  there  exists  a  nonsingular  q  dimensional  matrix  H 
such  that 


T  _  ?f-1  n  IT 


(A. 16)   A  =T^G   H. 


Asymptotic  equivalence  of  •^-pj.^t  and  6   then  follows  from  V  =  V  and  equations 
(A. 12)  and  (A. 15). 


A 

Proof  of  Theorem  4-4:   Consider  the  estimator  6-g  which  is  obtained  by 
solving 


(A.17)  min  gj(6) •V-^gj(6), 


This  estimator  is  a  nonlinear  3SLS  (NL3SLS)  estimator  where  Z  is  used  as 
instrumental  variables  for  each  of  the  M  original  equations  and  a  for  each  of 
the  L  additional  equations  which  arise  from  the  covariance  restrictions. 
Noting  that  dgj(6)/56  =  -  [x'Z,  X' (l^j  ©  U)(l+P)s]  ,  so  that  by  taking  a 
Taylor's  expansion  of  gL,(6^)  around  6  and  using  the  first-order  condition 
Bgj(6g)/56'v'''gj(6j)  =  0  for  (A.17)  r^l{^^  -  6)  also  satisfies  equati'on 

,      .  A         A  ^    ^         . 

(A.I 2).        Therefore  6.   and  6^  are  as3raiptotically  equivalent. 

A 

To  show  that  6_  is  a  BNL3SLS  estimator,  we  note  that  the  form  of  the 
optimal instruments for each equation is given by the expectation of the
derivative of the residual of that equation conditional on $Z$. It is well
known that this calculation yields $ZD_i$ as the optimal instrumental variables
for the $i$th original equation $(i=1,...,M)$, so that the columns of $Z$ contain
all the information which is useful for estimating the original equations.
For  the  kth  additional  equation  note  that 

(A.18)  E[5e^j^  /  d6]  =  E[dSj^' vec  (U^ 'U^)/d6] 

which  is  a  constant  and  does  not  depend  on  the  nonconstant  elements  of  Z, . 
Therefore  a  contains  all  the  information  which  is  useful  for  estimating  the 

A 

additional  equations.  Therefore  ^^is  a  BKL3SLS  estimator. 

Proof  of  Theorem  4-.  5:   Because  A  and  '^~   G  are  a  function  of  only  6  and  Z, 
equation  (A. 16)  also  holds  if  U+  is  nonnormal.   Then  by  Theorem  3>2  of  Hansen 
(1982)  Spj^T  is  asymptotically  as  efficient  as  the  efficient  GM  estimator, 
which  has  the  same  asymptotic  distribution  as  6. ,  if  and  only  if  there  is  a 
nonsingular  matrix  H  such  that 

(a.19)  v''gh  =  a  =  v'''gh, 


which  is  the  same  as  V"''  G  =  V"""  GH  for  H  =  Iffl"''  . 


Proof  of  Theorem  4-6:  By  compactness  of  D  and  uniform  boundedness  of  4+6 
moments   E[sup | (U    @  Z' ,  (vec  (U    'U    ) ) 'S ) |       '    ]    <   M'    for   some   positive   constants

6'    and   M'.      Then  by   Lemma   2.5   of  White    (1980)    grp(6)    converges   uniformly   in 
probability   to    g(5)   on   D,    and   therefore   gn,(6 ) '4j^g_,(6)   converges   uniformly   in 

probability  to    g{6)' <iig{6) .      By  hypothesis    g(6)'4jg(6)   has  a   unique  maximum  at 

A 

the  true  value  of  6  in  D.   Then  plim6^  =  6  follows  by  Amemiya  (1973) 

Lemma  3» 

A 

Asymptotic  normality  of  6.  now  follows  from  6  interior  to  D,  which 

A       I        A 

implies   the   first-order   condition  6sp(6 ,  )/a5(|j_g^  (6 ,  )   =  0  is   satisfied   with 

A 

probability  approaching  one  as  T  gets  large,  and  from  expanding  gn,(6,  )  around 
6,  which  implies  that  with  probability  approaching  one 


(A.19)  »^T(6  -  6)  =  -  [Bgj(6  )/66'(|.j6g^(6)/a6]"^5gj(6^)/56'4,^/Tgj(6), 


A  A         • 

where  6  lies  on  the  line  joining  6^,  and  6.   Then  by  plimS^,  =  plim6  =  6  and 

we  can  obtain,  as  in  the  derivation  of  (A. 12), 


(A.20)  /T(6^  -  6)  =  •(G>G)'''G*c|;/Tgj(6)  +  0^(1). 


The  conclusion  then  follows  from  equation  (A.11)> 


Proof of Lemma 5.1: Nonsingularity of $\Sigma$ and $Q$ implies that $V$ is nonsingular.
When rank$(G) = q$ the asymptotic equivalence of the FIML estimator and the
efficient instrumental variables estimator (Theorem 4.3) implies
$J = G'V^{-1}G$. It can also be shown by some matrix algebra, which is available
upon request from the authors, that this equality continues to hold when
rank$(G) < q$. The conclusion then follows by $V^{-1}$ positive definite.

Because rank$(G) = q$ if and only if there is a nonsingular $q$-dimensional
submatrix of $G$ we can assume without loss of generality that $G$ is square,
which simplifies the identification proofs. The following Lemma will prove
useful. For a particular assignment $p$ let $C_p = \mathrm{diag}[C_{p1},...,C_{pM}]$.


Lemma A1: For some $2^L$-tuple of positive integers $(\lambda_1,...,\lambda_{2^L})$,

$\det(G) = \sum_{p=1}^{2^L} (-1)^{\lambda_p}\det(C_p)$.


Proof: Let the rows of $G_2$ be denoted by $s_k$, $k=1,...,L$. Each $k$ corresponds
to a restriction $\sigma_{ij} = 0$ for some $i \neq j$. Further, each $s_k$ is a sum of two $1 \times q$
vectors, $s_{ki} + s_{kj}$, where $s_{ki}$ has $\mathrm{plim}(u_j'X_i/T)$ for the subvector corresponding
to $\delta_i$ and zeros for all other subvectors and $s_{kj}$ has $\mathrm{plim}(u_i'X_j/T)$ for the
subvector corresponding to $\delta_j$ and zeros for all other subvectors. We can
identify $s_{ki}$ with an assignment of residual $j$ to equation $i$ and $s_{kj}$ with an
assignment of residual $i$ to equation $j$. We have

$$G = \begin{bmatrix} (I_M \otimes Q)D \\ s_{1i} + s_{1j} \\ \vdots \\ s_{Li} + s_{Lj} \end{bmatrix},$$

where we drop the $k$ subscript on $i$ and $j$ for notational convenience. For each of
the $2^L$ distinct assignments, indexed by $p$, let

$$G_p = \begin{bmatrix} (I_M \otimes Q)D \\ \bar{S}_p \end{bmatrix},$$

where $\bar{S}_p$ is the $L \times q$ matrix which has its $k$th row $s_{ki}$ if $u_j$ is assigned to
equation $i$ or $s_{kj}$ if $u_i$ is assigned to equation $j$. The determinant of a
matrix is a linear function of any particular row of the matrix. It follows
that if $L = 1$

(5.18)   $\det(G) = \det(G_1) + \det(G_2)$.

Then induction on $L$ gives $\det(G) = \sum_{p=1}^{2^L} \det(G_p)$.

Now consider $G_p$ for each $p$. The matrix $(I_M \otimes Q)D$ is block diagonal, where
the column partition corresponds to $\delta_i$ for $i=1,...,M$, and the $i$th diagonal
block is $\mathrm{plim}\, Z'X_i/T$. Further the $k$th row of $\bar{S}_p$ consists of zeros except for
the subvector corresponding to $\delta_i$ where $\mathrm{plim}(u_j'X_i/T)$ appears. Then by
interchanging pairs of rows of $G_p$, we can obtain $C_p$ from $G_p$. That is, $C_p =
E_pG_p$, where $E_p$ is a product of matrices which interchange a pair of rows of
$G_p$. Note that $E_p$ satisfies $E_p'E_p = I$, so that $\det(E_p) = (-1)^{\lambda_p}$ for $\lambda_p$ equal
to 1 or 2. It follows that $\det(G_p) = (-1)^{\lambda_p}\det(C_p)$. Then since this holds for each
$p$, $\det(G) = \sum_{p=1}^{2^L} \det(G_p) = \sum_{p=1}^{2^L} (-1)^{\lambda_p}\det(C_p)$.
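The step from row-linearity of the determinant to the sum over $2^L$ assignments can be checked numerically. The following NumPy sketch uses random matrices as stand-ins: `top` plays the role of the rows coming from $(I_M \otimes Q)D$ and `s[k, 0]`, `s[k, 1]` play the roles of $s_{ki}$ and $s_{kj}$. These names, the dimensions, and the random values are assumptions for illustration only and carry no econometric content.

```python
from itertools import product

import numpy as np

rng = np.random.default_rng(0)
q, L = 5, 2                              # G is taken to be square, q x q
top = rng.standard_normal((q - L, q))    # stand-in for the first q - L rows
s = rng.standard_normal((L, 2, q))       # two candidate pieces for each of the L rows

# G stacks the fixed rows on top of the L rows s_ki + s_kj.
G = np.vstack([top, s[:, 0, :] + s[:, 1, :]])

# Sum the determinants of the 2^L matrices that keep one piece per row.
det_sum = sum(
    np.linalg.det(np.vstack([top, s[np.arange(L), list(choice), :]]))
    for choice in product((0, 1), repeat=L)
)

print(np.linalg.det(G), det_sum)
```

The two printed numbers agree up to floating-point error, which is the identity $\det(G) = \sum_p \det(G_p)$ used in the proof. The further step relating each $\det(G_p)$ to $\pm\det(C_p)$ is a row permutation and is not shown here.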
Proof of Theorem 5.2: If rank$(G) = q$ then by Lemma A1 rank$(C_p) = q$ for some
$p$. Since $C_p$ is block diagonal, with diagonal blocks $C_{pi}$, $(i=1,...,M)$, we have
$\sum_{i=1}^{M} \mathrm{rank}(C_{pi}) = q$. Each $C_{pi}$ has $q_i$ columns so that rank$(C_{pi}) \leq q_i$, and,
since $\sum_{i=1}^{M} q_i = q$, consequently rank$(C_{pi}) = q_i$, $(i=1,...,M)$.


Proof of Theorem 5.4: Hall's Theorem (see Geraci (1976)) states that, for a
set $S$, a collection of subsets $S_1,...,S_N$ has an associated $N$-tuple
$(s_1,...,s_N)$ of distinct elements of $S$ with $s_i \in S_i$, $(i=1,...,N)$, if and only
if for each $n=1,...,N$ the union of every $n$ subsets $S_i$ contains at least $n$ distinct
elements of $S$. An assignment of residuals such that at least $a_i$ residuals
are assigned to each equation corresponds to a vector $(s_1,...,s_m)$, with
$m = \sum_{i=1}^{M} a_i$, of distinct elements of $\{1,...,L\}$ such that each copy of $R_i$ has
one corresponding element. Each distinct $k$ selected from a copy of $R_i$
corresponds to a residual $u_j$ assigned as an instrumental variable for
equation $i$, where the $k$th covariance restriction is $\sigma_{ij} = 0$. The proof then
follows immediately from Hall's Theorem.

Proof of Lemma 5.5: We drop the $p$ subscript for notational convenience. We
also assume $i=1$. Note that the first column of $\Sigma_1$ consists entirely of
zeros, since to qualify as an instrument for the first equation a disturbance
$u_j$ must satisfy $E(u_1u_j) = \sigma_{1j} = 0$. Let $e_1$ be an $M$ dimensional unit vector
with a one in the first position and zeros elsewhere. Then $\phi_1A_1 = 0$ and the
covariance restrictions imply $Fe_1 = 0$ where $F = (A'\phi_1', \Sigma_1')'$. Note that
rank$(FB^{-1})$ = rank$(F)$. Also $FB^{-1}B_1 = Fe_1 = 0$ where $B_1$ is the first column of $B$, so
that the first column of $FB^{-1}$ is a linear combination of the other columns of $FB^{-1}$
by $B_{11} = 1$. Let $\Gamma_1$ be the rows of $\Gamma$ corresponding to the excluded
predetermined variables. Then $\phi_1AB^{-1} = [E_1', (B')^{-1}\Gamma_1']'$ where $E_1$ is an
$(M-1-r_1) \times M$ matrix for which each row has a one in the position corresponding to
a distinct excluded endogenous variable and zeros elsewhere. Let $(B^{-1})_1$ be
the columns of $B^{-1}$ corresponding to included right-hand side endogenous
variables. Note that

$$FB^{-1} = \begin{bmatrix} E_1 \\ \Gamma_1B^{-1} \\ \Sigma_1B^{-1} \end{bmatrix}.$$

Then row reduction of $FB^{-1}$ using the rows of $E_1$, and the fact that the first
column of $FB^{-1}$ is a linear combination of the other columns imply

(B.9)    rank$(FB^{-1})$ = rank$\begin{bmatrix} \Gamma_1(B^{-1})_1 \\ \Sigma_1(B^{-1})_1 \end{bmatrix} + M-1-r_1$.

Now consider $C_1$. Note that for any $j \neq 1$, $\mathrm{plim}\, u_j'X_1/T = [(\Sigma)_j(B^{-1})_1, 0_1]$, where
$0_1$ is a $1 \times s_1$ vector of zeros and $(\Sigma)_j$ is the $j$th row of $\Sigma$. By $Q$ nonsingular,

(B.10)   rank$(C_1)$ = rank$\left(\begin{bmatrix} Q & 0 \\ 0 & I \end{bmatrix}\begin{bmatrix} D_1 \\ \Sigma_1\bar{B}_1' \end{bmatrix}\right)$ = rank$\begin{bmatrix} \Pi_1 & I_1 \\ \Sigma_1(B^{-1})_1 & 0 \end{bmatrix}$ = rank$\begin{bmatrix} -\Gamma(B^{-1})_1 & I_1 \\ \Sigma_1(B^{-1})_1 & 0 \end{bmatrix}$.

By column reduction, using the columns of $[I_1', 0']'$, equation (B.10)
implies

(B.11)   rank$(C_1)$ = rank$\begin{bmatrix} \Gamma_1(B^{-1})_1 \\ \Sigma_1(B^{-1})_1 \end{bmatrix} + s_1$.

Then equations (B.9) and (B.11), together with rank$(F)$ = rank$(FB^{-1})$ and
$q_1 = r_1 + s_1$, imply $M-1-$rank$(F) = q_1-$rank$(C_1)$, from which
the conclusion of the proposition follows.

Proof of Theorem 5.6: Note that for any assignment $p$ with rank$(C_{pi}) < q_i$ for
some $i$ it follows that $\det(C_p) = 0$. Then the conclusion follows from Lemma
A1, since the determinant of $G$ is nonzero because exactly one of the
determinants $\det(C_p)$ is nonzero.

Proof  of  Theorem  5-7:   This  proof  follows  closely  the  proof  of  Lemma  5»4. 
Let  P  =  diag((^^ '''m''^'^*  ^^M®^''  (^m  ®  ^^  ^^^^  ^^ '  * 


Post-multiplication  of  P  by  L  @  B~  and  row  reduction  using  E. ,  i=1,...,M  as 
in  the  proof  of  Lemma  5 -4  gives 


.  M       p    M 

(B.12)   rank  P  (l^^  @  B"  )  =  rank(G)  -25+  M  -M-  I  r  =  (rank(G)-q) 

i=1  i       i=1  i 